Paper Title
Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems
Paper Authors
Paper Abstract
We consider nonconvex-concave minimax optimization problems of the form $\min_{\bf x}\max_{\bf y\in{\mathcal Y}} f({\bf x},{\bf y})$, where $f$ is strongly-concave in $\bf y$ but possibly nonconvex in $\bf x$, and ${\mathcal Y}$ is a convex and compact set. We focus on the stochastic setting, where we can only access an unbiased stochastic gradient estimate of $f$ at each iteration. This formulation includes many machine learning applications, such as robust optimization and adversarial training, as special cases. We are interested in finding an ${\mathcal O}(\varepsilon)$-stationary point of the function $\Phi(\cdot)=\max_{\bf y\in{\mathcal Y}} f(\cdot, {\bf y})$. The most popular algorithm to solve this problem is stochastic gradient descent ascent, which requires ${\mathcal O}(\kappa^3\varepsilon^{-4})$ stochastic gradient evaluations, where $\kappa$ is the condition number. In this paper, we propose a novel method called Stochastic Recursive gradiEnt Descent Ascent (SREDA), which estimates gradients more efficiently using variance reduction. This method achieves the best known stochastic gradient complexity of ${\mathcal O}(\kappa^3\varepsilon^{-3})$, and its dependence on $\varepsilon$ is optimal for this problem.
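As a concrete illustration of the problem setup, here is a minimal sketch of the projected stochastic gradient descent ascent (SGDA) baseline mentioned in the abstract, run on a toy nonconvex-strongly-concave objective. The projection radius, step sizes, noise model, and test function are illustrative assumptions and are not taken from the paper, which instead proposes SREDA, a variance-reduced variant.

```python
import numpy as np

def project_ball(y, radius=1.0):
    """Euclidean projection onto Y = {y : ||y||_2 <= radius}, a convex compact set."""
    nrm = np.linalg.norm(y)
    return y if nrm <= radius else y * (radius / nrm)

def sgda(grad_x, grad_y, x0, y0, eta_x=1e-2, eta_y=5e-2,
         iters=5000, noise_std=0.1, seed=0):
    """Projected stochastic gradient descent ascent: descend in x, ascend in y."""
    rng = np.random.default_rng(seed)
    x, y = np.array(x0, dtype=float), np.array(y0, dtype=float)
    for _ in range(iters):
        # Unbiased stochastic gradient estimates: true gradient plus zero-mean noise.
        gx = grad_x(x, y) + noise_std * rng.standard_normal(x.shape)
        gy = grad_y(x, y) + noise_std * rng.standard_normal(y.shape)
        x = x - eta_x * gx                 # gradient descent step on x
        y = project_ball(y + eta_y * gy)   # projected gradient ascent step on y
    return x, y

# Toy instance (illustrative, not from the paper):
# f(x, y) = sum_i x_i^2 / (1 + x_i^2) + <x, y> - 0.5 * ||y||^2,
# which is nonconvex in x and 1-strongly-concave in y.
grad_x = lambda x, y: 2 * x / (1 + x**2) ** 2 + y
grad_y = lambda x, y: x - y

x_out, y_out = sgda(grad_x, grad_y, x0=np.ones(5), y0=np.zeros(5))
print("approximate stationary point x:", x_out)
```

SREDA replaces the fresh stochastic gradients in each iteration with recursive, variance-reduced estimators, which is what lowers the complexity from ${\mathcal O}(\kappa^3\varepsilon^{-4})$ to ${\mathcal O}(\kappa^3\varepsilon^{-3})$; the sketch above only shows the baseline update structure.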