Paper Title

Using Focal Loss to Fight Shallow Heuristics: An Empirical Analysis of Modulated Cross-Entropy in Natural Language Inference

Paper Authors

Frano Rajič, Ivan Stresec, Axel Marmet, Tim Poštuvan

Paper Abstract

There is no such thing as a perfect dataset. In some datasets, deep neural networks discover underlying heuristics that allow them to take shortcuts in the learning process, resulting in poor generalization capability. Instead of using standard cross-entropy, we explore whether a modulated version of cross-entropy called focal loss can constrain the model so as not to use heuristics and improve generalization performance. Our experiments in natural language inference show that focal loss has a regularizing impact on the learning process, increasing accuracy on out-of-distribution data, but slightly decreasing performance on in-distribution data. Despite the improved out-of-distribution performance, we demonstrate the shortcomings of focal loss and its inferiority in comparison to the performance of methods such as unbiased focal loss and self-debiasing ensembles.
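
For context, focal loss (Lin et al., 2017) modulates standard cross-entropy by down-weighting the loss on well-classified examples. Writing p_t for the model's predicted probability of the true class and γ ≥ 0 for the focusing parameter, its unweighted form is commonly stated as:

```latex
\mathrm{FL}(p_t) = -(1 - p_t)^{\gamma} \log(p_t)
```

Setting γ = 0 recovers standard cross-entropy, while larger γ shrinks the contribution of easy examples; this is the modulation the abstract refers to.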
