Fairness-aware Model-agnostic Positive and Unlabeled Learning

Paper title

Fairness-aware Model-agnostic Positive and Unlabeled Learning

Authors

Ziwei Wu, Jingrui He

Abstract

With the increasing application of machine learning in high-stakes decision-making problems, potential algorithmic bias towards people from certain social groups poses negative impacts on individuals and on society at large. In real-world scenarios, many such problems involve positive and unlabeled data, for example in medical diagnosis, criminal risk assessment and recommender systems. For instance, in medical diagnosis, only the diagnosed diseases are recorded (positive) while the others are not (unlabeled). Despite the large amount of existing work on fairness-aware machine learning in the (semi-)supervised and unsupervised settings, the fairness issue is largely under-explored in the aforementioned Positive and Unlabeled Learning (PUL) context, where it is usually more severe. In this paper, to alleviate this tension, we propose a fairness-aware PUL method named FairPUL. In particular, for binary classification over individuals from two populations, we aim to achieve similar true positive rates and false positive rates in both populations as our fairness metric. Based on the analysis of the optimal fair classifier for PUL, we design a model-agnostic post-processing framework that leverages both the positive examples and the unlabeled ones. Our framework is proven to be statistically consistent in terms of both the classification error and the fairness metric. Experiments on synthetic and real-world data sets demonstrate that our framework outperforms state-of-the-art methods in both PUL and fair classification.
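
The fairness metric named in the abstract (similar true positive rates and false positive rates across two populations) can be illustrated with a minimal sketch. This is not the FairPUL method itself, only one way to measure the two gaps. It assumes fully labeled evaluation data, which a real PUL setting does not provide, and the function names `group_rates` and `equalized_odds_gaps` and the toy arrays are hypothetical.

```python
import numpy as np

def group_rates(y_true, y_pred, group, g):
    """Return (TPR, FPR) for the individuals whose group label equals g."""
    mask = group == g
    yt, yp = y_true[mask], y_pred[mask]
    tpr = yp[yt == 1].mean()  # share of actual positives predicted positive
    fpr = yp[yt == 0].mean()  # share of actual negatives predicted positive
    return tpr, fpr

def equalized_odds_gaps(y_true, y_pred, group):
    """Absolute TPR and FPR differences between group 0 and group 1."""
    tpr0, fpr0 = group_rates(y_true, y_pred, group, 0)
    tpr1, fpr1 = group_rates(y_true, y_pred, group, 1)
    return abs(tpr0 - tpr1), abs(fpr0 - fpr1)

# Toy data (illustrative only): ground truth assumed known for evaluation.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 0, 0, 1, 0, 1, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

tpr_gap, fpr_gap = equalized_odds_gaps(y_true, y_pred, group)
print(tpr_gap, fpr_gap)
```

Both gaps are zero when the classifier treats the two populations identically in this sense. A post-processing approach would adjust the decisions of an already trained model so that both gaps shrink.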
