Paper Title
Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm
Paper Authors
Paper Abstract
Although learning in high dimensions is commonly believed to suffer from the curse of dimensionality, modern machine learning methods often exhibit an astonishing power to tackle a wide range of challenging real-world learning problems without using abundant amounts of data. How exactly these methods break this curse remains a fundamental open question in the theory of deep learning. While previous efforts have investigated this question by studying the data (D), model (M), and inference algorithm (I) as independent modules, in this paper, we analyze the triplet (D, M, I) as an integrated system and identify important synergies that help mitigate the curse of dimensionality. We first study the basic symmetries associated with various learning algorithms (M, I), focusing on four prototypical architectures in deep learning: fully-connected networks (FCN), locally-connected networks (LCN), and convolutional networks with and without pooling (GAP/VEC). We find that learning is most efficient when these symmetries are compatible with those of the data distribution and that performance significantly deteriorates when any member of the (D, M, I) triplet is inconsistent or suboptimal.
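To make the symmetry compatibility argument concrete, below is a minimal NumPy sketch (not from the paper; the 1D setup, function names, and periodic boundary conditions are illustrative assumptions). It shows the distinction the abstract draws between the architectures: a circular convolution with a shared filter is translation-equivariant, adding a global average pooling readout (GAP) turns that equivariance into full translation invariance, and a locally-connected layer (LCN) with unshared per-position filters breaks the symmetry.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w):
    """1D circular cross-correlation with a shared filter w."""
    k = len(w)
    return np.array([np.dot(w, np.roll(x, -i)[:k]) for i in range(len(x))])

def lcn1d(x, W):
    """Locally-connected layer: same receptive fields, but an unshared filter per position."""
    k = W.shape[1]
    return np.array([np.dot(W[i], np.roll(x, -i)[:k]) for i in range(len(x))])

def shift(x, s):
    """Cyclic translation of the input by s positions."""
    return np.roll(x, s)

x = rng.normal(size=16)        # toy 1D "image"
w = rng.normal(size=3)         # shared convolutional filter
W = rng.normal(size=(16, 3))   # one independent filter per position (LCN)

# Convolution is translation-equivariant: conv(shift(x)) == shift(conv(x)).
print(np.allclose(conv1d(shift(x, 5), w), shift(conv1d(x, w), 5)))   # True

# Global average pooling promotes equivariance to invariance (the GAP readout).
print(np.isclose(conv1d(shift(x, 5), w).mean(), conv1d(x, w).mean()))  # True

# Unshared LCN filters break the symmetry: equivariance fails generically.
print(np.allclose(lcn1d(shift(x, 5), W), shift(lcn1d(x, W), 5)))      # False
```

In the abstract's terms, the (M, I) pairs built from convolutional layers carry translation symmetry explicitly, so that structure can only pay off when the data distribution D shares the symmetry; when it does not, or when the model discards it (as the LCN does), the compatibility condition fails and one would expect the performance degradation the authors report.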