Paper Title

Choosing the Sample with Lowest Loss makes SGD Robust

Authors

Vatsal Shah, Xiaoxia Wu, Sujay Sanghavi

Abstract

The presence of outliers can potentially significantly skew the parameters of machine learning models trained via stochastic gradient descent (SGD). In this paper, we propose a simple variant of the simple SGD method: in each step, first choose a set of k samples, then from these choose the one with the smallest current loss, and do an SGD-like update with this chosen sample. Vanilla SGD corresponds to k = 1, i.e. no choice; k >= 2 represents a new algorithm that is, however, effectively minimizing a non-convex surrogate loss. Our main contribution is a theoretical analysis of the robustness properties of this idea for ML problems which are sums of convex losses; these results are backed up with linear regression and small-scale neural network experiments.
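The selection rule described in the abstract is easy to state concretely. Below is a minimal illustrative sketch (not the authors' implementation) of the min-loss-of-k update for least-squares regression; the function name mkl_sgd, the toy data, and all hyperparameter values are assumptions made for illustration. Setting k = 1 recovers vanilla SGD.

```python
import numpy as np

def mkl_sgd(X, y, k=2, lr=0.01, n_steps=1000, seed=0):
    """Sketch of min-loss-of-k SGD: each step draws k candidate samples,
    keeps the one with the smallest current loss, and takes an SGD step on it.
    k = 1 corresponds to vanilla SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_steps):
        idx = rng.choice(n, size=k, replace=False)  # candidate set of k samples
        residuals = X[idx] @ w - y[idx]             # per-sample residuals
        losses = 0.5 * residuals ** 2               # current squared-error losses
        j = idx[np.argmin(losses)]                  # sample with the smallest current loss
        grad = (X[j] @ w - y[j]) * X[j]             # gradient of 0.5 * (x_j^T w - y_j)^2
        w -= lr * grad                              # SGD-like update on the chosen sample
    return w

# Illustrative usage: a few corrupted labels play the role of outliers.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.01 * rng.normal(size=500)
y[:25] += 10.0                                      # outlier labels
w_hat = mkl_sgd(X, y, k=4, lr=0.01, n_steps=5000)
```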
