Paper Title
Optimal bump functions for shallow ReLU networks: Weight decay, depth separation and the curse of dimensionality
Paper Authors
Paper Abstract
In this note, we study how neural networks with a single hidden layer and ReLU activation interpolate data drawn from a radially symmetric distribution, with target label 1 at the origin and 0 outside the unit ball, when no labels are known inside the unit ball. With weight decay regularization and in the infinite-neuron, infinite-data limit, we prove that a unique radially symmetric minimizer exists, whose weight decay regularizer and Lipschitz constant grow as $d$ and $\sqrt{d}$, respectively. We furthermore show that the weight decay regularizer grows exponentially in $d$ if the label $1$ is imposed on a ball of radius $\varepsilon$ rather than just at the origin. By comparison, a neural network with two hidden layers can approximate the target function without encountering the curse of dimensionality.
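For concreteness, the variational problem the abstract describes can be sketched as follows; this is a hedged reconstruction assuming the standard weight-decay formulation for a single-hidden-layer ReLU network, and the exact normalization and constraint set are assumptions rather than statements taken from the paper. For a network $f(x) = \sum_i a_i\,\sigma(w_i^\top x + b_i)$ with $\sigma(t) = \max(t,0)$, one would minimize
\[
  \frac{1}{2}\sum_i \bigl(a_i^2 + \|w_i\|_2^2\bigr)
  \quad \text{subject to} \quad f(0) = 1, \qquad f(x) = 0 \ \text{ for } \|x\|_2 \ge 1 .
\]
By the usual rescaling argument, this penalty is equivalent to the path norm $\sum_i |a_i|\,\|w_i\|_2$; the abstract states that its minimal value grows like $d$ for the point constraint at the origin, and exponentially in $d$ when the label $1$ is instead imposed on a ball of radius $\varepsilon$.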