Paper Title

How Does a Neural Network's Architecture Impact Its Robustness to Noisy Labels?

Authors

Jingling Li, Mozhi Zhang, Keyulu Xu, John P. Dickerson, Jimmy Ba

Abstract

Noisy labels are inevitable in large real-world datasets. In this work, we explore an area understudied by previous works -- how the network's architecture impacts its robustness to noisy labels. We provide a formal framework connecting the robustness of a network to the alignments between its architecture and target/noise functions. Our framework measures a network's robustness via the predictive power in its representations -- the test performance of a linear model trained on the learned representations using a small set of clean labels. We hypothesize that a network is more robust to noisy labels if its architecture is more aligned with the target function than the noise. To support our hypothesis, we provide both theoretical and empirical evidence across various neural network architectures and different domains. We also find that when the network is well-aligned with the target function, its predictive power in representations could improve upon state-of-the-art (SOTA) noisy-label-training methods in terms of test accuracy and even outperform sophisticated methods that use clean labels.
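The robustness measurement described above (test performance of a linear model trained on frozen learned representations using a small set of clean labels) can be sketched as a simple linear probe. The sketch below is illustrative only: the synthetic class-shifted features stand in for a trained network's representations, and the plain logistic-regression probe is an assumed choice of linear model, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for frozen network representations:
# two classes whose learned features differ in their means.
def make_features(n, d=16):
    y = rng.integers(0, 2, size=n)
    x = rng.normal(size=(n, d)) + 2.0 * y[:, None]  # class-shifted features
    return x, y

# A small set of *clean* labels for probing, plus a held-out test set.
x_clean, y_clean = make_features(200)
x_test, y_test = make_features(1000)

# Linear probe: logistic regression fit by gradient descent on the clean subset.
w = np.zeros(x_clean.shape[1])
b = 0.0
lr = 0.1
for _ in range(500):
    logits = np.clip(x_clean @ w + b, -30.0, 30.0)
    p = 1.0 / (1.0 + np.exp(-logits))          # sigmoid probabilities
    grad_w = x_clean.T @ (p - y_clean) / len(y_clean)
    grad_b = np.mean(p - y_clean)
    w -= lr * grad_w
    b -= lr * grad_b

# "Predictive power" of the representations = test accuracy of the probe.
pred = (x_test @ w + b) > 0
accuracy = np.mean(pred == y_test)
print(f"probe test accuracy: {accuracy:.3f}")
```

In practice the probe would be trained on features extracted from a network that was itself trained on noisy labels; a higher probe accuracy indicates more predictive power retained in the representations.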
