Paper Title
Latent Adversarial Debiasing: Mitigating Collider Bias in Deep Neural Networks
Paper Authors
Paper Abstract
Collider bias is a harmful form of sample selection bias that neural networks are ill-equipped to handle. This bias manifests itself when the underlying causal signal is strongly correlated with other confounding signals due to the training data collection procedure. In situations where the confounding signal is easy to learn, deep neural networks will latch onto it and the resulting model will generalise poorly to in-the-wild test scenarios. We argue herein that the cause of failure is a combination of the deep structure of neural networks and the greedy gradient-driven learning process used: one that prefers easy-to-compute signals when available. We show it is possible to mitigate this by generating bias-decoupled training data using latent adversarial debiasing (LAD), even when the confounding signal is present in 100% of the training data. By training neural networks on these adversarial examples, we can improve their generalisation in collider bias settings. Experiments show state-of-the-art performance of LAD in label-free debiasing, with gains of 76.12% on background-coloured MNIST, 35.47% on foreground-coloured MNIST, and 8.27% on corrupted CIFAR-10.
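To make the idea concrete, the following is a minimal sketch of generating bias-decoupled training data by perturbing examples in the latent space of a pretrained autoencoder. The `encoder`, `decoder`, and `classifier` modules, the sign-gradient-ascent update, and all hyperparameters are illustrative assumptions rather than the exact LAD procedure from the paper.

```python
# A minimal sketch, assuming a pretrained autoencoder (encoder/decoder) and
# the classifier under training. The plain sign-gradient ascent in latent
# space is an illustrative stand-in for the paper's debiasing procedure.
import torch
import torch.nn.functional as F

def latent_adversarial_batch(encoder, decoder, classifier, x, y,
                             steps=5, step_size=0.1):
    """Perturb a batch in latent space so the classifier's loss increases,
    then decode back to image space. The intent is to disrupt the
    easy-to-learn confounding signal the classifier currently relies on,
    while the latent code largely preserves the causal content."""
    z = encoder(x).detach()                  # latent codes of the clean batch
    for _ in range(steps):
        z = z.clone().requires_grad_(True)
        loss = F.cross_entropy(classifier(decoder(z)), y)
        grad, = torch.autograd.grad(loss, z)
        # Gradient *ascent*: move the latent code towards higher loss,
        # away from the signal the classifier is latching onto.
        z = (z + step_size * grad.sign()).detach()
    return decoder(z).detach()               # bias-decoupled training images

# Hypothetical use inside a training loop:
# x_adv = latent_adversarial_batch(encoder, decoder, classifier, x, y)
# loss = F.cross_entropy(classifier(x_adv), y)
# loss.backward(); optimizer.step()
```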