Paper Title
The Two Dimensions of Worst-case Training and the Integrated Effect for Out-of-domain Generalization
Paper Authors
Paper Abstract
Training with an emphasis on "hard-to-learn" components of the data has been proven to be an effective method to improve the generalization of machine learning models, especially in settings where robustness (e.g., generalization across distributions) is valued. Existing literature discussing this "hard-to-learn" concept is mainly expanded either along the dimension of the samples or the dimension of the features. In this paper, we aim to introduce a simple view merging these two dimensions, leading to a new, simple yet effective, heuristic to train machine learning models by emphasizing the worst cases on both the sample and the feature dimensions. We name our method W2D following the concept of "Worst-case along Two Dimensions". We validate the idea and demonstrate its empirical strength over standard benchmarks.
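The abstract only describes W2D at a high level. Below is a minimal, hypothetical PyTorch sketch of the general idea it states: keep the highest-loss samples in a mini-batch (worst case along the sample dimension) and mask the input features the model relies on most (worst case along the feature dimension), then train on the result. The function name `w2d_style_step`, the selection ratios, and the gradient-saliency masking are illustrative assumptions, not the paper's exact algorithm.

```python
# Illustrative sketch of "worst-case along two dimensions" training.
# Assumptions (not from the paper): loss-based sample selection,
# input-gradient saliency for feature masking, and the ratio defaults.
import torch
import torch.nn.functional as F

def w2d_style_step(model, x, y, optimizer, sample_ratio=0.5, feature_ratio=0.1):
    # --- Sample dimension: keep the hardest (highest-loss) fraction of the batch.
    model.eval()
    with torch.no_grad():
        per_sample_loss = F.cross_entropy(model(x), y, reduction="none")
    k = max(1, int(sample_ratio * x.size(0)))
    hard_idx = per_sample_loss.topk(k).indices
    x_hard, y_hard = x[hard_idx], y[hard_idx]

    # --- Feature dimension: estimate which input features the model leans on most
    # via input gradients, then zero out that top fraction so the update must rely
    # on the remaining, "harder" features.
    x_hard = x_hard.clone().requires_grad_(True)
    saliency = torch.autograd.grad(
        F.cross_entropy(model(x_hard), y_hard), x_hard
    )[0].abs().flatten(1)
    n_mask = max(1, int(feature_ratio * saliency.size(1)))
    top_feat = saliency.topk(n_mask, dim=1).indices
    mask = torch.ones_like(saliency).scatter_(1, top_feat, 0.0).view_as(x_hard)

    # --- Train on the worst case along both dimensions.
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_hard.detach() * mask), y_hard)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch, `sample_ratio` controls how aggressively easy samples are discarded and `feature_ratio` controls how many of the most-relied-upon input features are suppressed; both would need to be tuned per benchmark.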