Paper Title

Mean-Field and Kinetic Descriptions of Neural Differential Equations

Authors

Herty, M., Trimborn, T., Visconti, G.

Abstract


Nowadays, neural networks are widely used in many applications as artificial intelligence models for learning tasks. Since neural networks typically process very large amounts of data, it is convenient to formulate them within mean-field and kinetic theory. In this work we focus on a particular class of neural networks, namely residual neural networks, assuming that each layer is characterized by the same number of neurons $N$, which is fixed by the dimension of the data. This assumption allows us to interpret the residual neural network as a time-discretized ordinary differential equation, in analogy with neural differential equations. The mean-field description is then obtained in the limit of infinitely many input data. This leads to a Vlasov-type partial differential equation which describes the evolution of the distribution of the input data. We analyze steady states and sensitivity with respect to the parameters of the network, namely the weights and the bias. In the simple setting of a linear activation function and one-dimensional input data, the study of the moments provides insights into the choice of the network parameters. Furthermore, a modification of the microscopic dynamics, inspired by stochastic residual neural networks, leads to a Fokker-Planck formulation of the network, in which the concept of network training is replaced by the task of fitting distributions. The analysis is validated by numerical simulations on artificial data. In particular, results on classification and regression problems are presented.
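The correspondence between residual networks and time-discretized ordinary differential equations that the abstract builds on can be sketched concretely: with a fixed layer width $N$, each residual layer acts as one explicit Euler step of the dynamics $\dot{x} = \sigma(W(t)x + b(t))$. The snippet below is a minimal illustration of this standard identification, not code from the paper; the function name, the step size `h`, and the random parameters are hypothetical choices for the sketch.

```python
import numpy as np

def resnet_forward(x, weights, biases, h=1.0, act=np.tanh):
    """Forward pass of a residual network read as an explicit Euler
    discretization of dx/dt = act(W(t) x + b(t)).

    Every layer keeps the same width N (the data dimension), matching
    the paper's standing assumption; h plays the role of the time step."""
    for W, b in zip(weights, biases):
        x = x + h * act(W @ x + b)  # residual update = one Euler step
    return x

# Toy example: N = 2, three layers, hypothetical random parameters.
rng = np.random.default_rng(0)
N, L = 2, 3
Ws = [rng.standard_normal((N, N)) for _ in range(L)]
bs = [rng.standard_normal(N) for _ in range(L)]
out = resnet_forward(np.ones(N), Ws, bs, h=0.1)
```

As `h` shrinks and the number of layers grows, the iteration approximates the continuous-time flow; the mean-field limit over many input samples `x` then yields the Vlasov-type transport equation for their distribution discussed in the abstract.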
