Paper Title


Stability of Deep Neural Networks via discrete rough paths

Paper Authors

Christian Bayer, Peter K. Friz, Nikolas Tapia

Paper Abstract


Using rough path techniques, we provide a priori estimates for the output of Deep Residual Neural Networks in terms of both the input data and the (trained) network weights. As trained network weights are typically very rough when seen as functions of the layer, we propose to derive stability bounds in terms of the total $p$-variation of trained weights for any $p\in[1,3]$. Unlike the $C^1$-theory underlying the neural ODE literature, our estimates remain bounded even in the limiting case of weights behaving like Brownian motions, as suggested in [arXiv:2105.12245]. Mathematically, we interpret residual neural networks as solutions to (rough) difference equations, and analyse them based on recent results on discrete-time signatures and rough path theory.
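The two objects the abstract refers to admit a short illustration. Below is a minimal Python sketch, not taken from the paper: it reads a residual network as the difference equation $x_{k+1} = x_k + f(x_k, W_k)$ and computes the total $p$-variation of the trained weights across layers via the standard $O(N^2)$ dynamic program. The function names `resnet_forward` and `p_variation`, the tanh residual block, and the Brownian-like toy weights are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def p_variation(xs, p):
    """Total p-variation of the discrete path xs[0], ..., xs[N] (each entry a
    vector or matrix, e.g. the layer-k weights), i.e.
        sup over partitions of sum_i |xs[t_{i+1}] - xs[t_i]|^p, raised to 1/p,
    computed with the O(N^2) dynamic program
        best[j] = max_{i<j} ( best[i] + |xs[j] - xs[i]|^p )."""
    n = len(xs)
    best = np.zeros(n)
    for j in range(1, n):
        incr = np.array([np.linalg.norm(xs[j] - xs[i]) ** p for i in range(j)])
        best[j] = np.max(best[:j] + incr)
    return best[-1] ** (1.0 / p)

def resnet_forward(x, weights, activation=np.tanh):
    """A residual network read as a difference equation:
    x_{k+1} = x_k + f(x_k, W_k), here with f(x, W) = tanh(W x)."""
    for W in weights:
        x = x + activation(W @ x)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n_layers = 8, 50
    # Toy "trained" weights: a Brownian-like sequence across the layer index,
    # mimicking the roughness discussed in the abstract (an assumption here).
    weights = np.cumsum(
        rng.normal(scale=n_layers ** -0.5, size=(n_layers, d, d)), axis=0
    )
    x_out = resnet_forward(rng.normal(size=d), weights)
    print("output norm:", np.linalg.norm(x_out))
    print("2-variation of weights across layers:", p_variation(weights, p=2))
```

For such Brownian-like weight sequences the $p$-variation stays bounded (as $N$ grows) only for $p > 2$, which is why bounds in terms of $p$-variation with $p \in [1,3]$, rather than a $C^1$-type norm, remain meaningful in this regime.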
