Paper Title

Choosing the Sample with Lowest Loss makes SGD Robust

Authors

Vatsal Shah, Xiaoxia Wu, Sujay Sanghavi

Abstract

The presence of outliers can potentially significantly skew the parameters of machine learning models trained via stochastic gradient descent (SGD). In this paper, we propose a simple variant of the simple SGD method: in each step, first choose a set of k samples, then from these choose the one with the smallest current loss, and do an SGD-like update with this chosen sample. Vanilla SGD corresponds to k = 1, i.e. no choice; k >= 2 represents a new algorithm that is, however, effectively minimizing a non-convex surrogate loss. Our main contribution is a theoretical analysis of the robustness properties of this idea for ML problems which are sums of convex losses; these results are backed up with linear regression and small-scale neural network experiments.
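The selection rule described in the abstract is easy to state concretely. Below is a minimal illustrative sketch (not the authors' implementation) of the min-loss-of-k update for least-squares regression; the function name mkl_sgd, the toy data, and all hyperparameter values are assumptions made for illustration. Setting k = 1 recovers vanilla SGD.

```python
import numpy as np

def mkl_sgd(X, y, k=2, lr=0.01, n_steps=1000, seed=0):
    """Sketch of min-loss-of-k SGD: each step draws k candidate samples,
    keeps the one with the smallest current loss, and takes an SGD step on it.
    k = 1 corresponds to vanilla SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_steps):
        idx = rng.choice(n, size=k, replace=False)  # candidate set of k samples
        residuals = X[idx] @ w - y[idx]             # per-sample residuals
        losses = 0.5 * residuals ** 2               # current squared-error losses
        j = idx[np.argmin(losses)]                  # sample with the smallest current loss
        grad = (X[j] @ w - y[j]) * X[j]             # gradient of 0.5 * (x_j^T w - y_j)^2
        w -= lr * grad                              # SGD-like update on the chosen sample
    return w

# Illustrative usage: a few corrupted labels play the role of outliers.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.01 * rng.normal(size=500)
y[:25] += 10.0                                      # outlier labels
w_hat = mkl_sgd(X, y, k=4, lr=0.01, n_steps=5000)
```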
