Paper title
Continuous-time stochastic gradient descent for optimizing over the stationary distribution of stochastic differential equations
Paper authors
Abstract
We develop a new continuous-time stochastic gradient descent method for optimizing over the stationary distribution of stochastic differential equation (SDE) models. The algorithm continuously updates the SDE model's parameters using an estimate for the gradient of the stationary distribution. The gradient estimate is simultaneously updated using forward propagation of the SDE state derivatives, asymptotically converging to the direction of steepest descent. We rigorously prove convergence of the online forward propagation algorithm for linear SDE models (i.e., the multi-dimensional Ornstein-Uhlenbeck process) and present its numerical results for nonlinear examples. The proof requires analysis of the fluctuations of the parameter evolution around the direction of steepest descent. Bounds on the fluctuations are challenging to obtain due to the online nature of the algorithm (e.g., the stationary distribution will continuously change as the parameters change). We prove bounds for the solutions of a new class of Poisson partial differential equations (PDEs), which are then used to analyze the parameter fluctuations in the algorithm. Our algorithm is applicable to a range of mathematical finance applications involving statistical calibration of SDE models and stochastic optimal control for long time horizons where ergodicity of the data and stochastic process is a suitable modeling framework. Numerical examples explore these potential applications, including learning a neural network control for high-dimensional optimal control of SDEs and training stochastic point process models of limit order book events.
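The update scheme described in the abstract can be illustrated with a minimal sketch on a toy problem. Everything specific here is an assumption chosen for illustration and is not taken from the paper: a one-dimensional Ornstein-Uhlenbeck process dX = (theta - X) dt + sigma dW, whose stationary mean is theta; the objective J(theta) = (E[X] - beta)^2; a learning rate alpha_t = 1 / (1 + t); and an Euler-Maruyama discretization with step dt. The function name `online_forward_propagation` and all parameter names are hypothetical. The state derivative X~ = dX/dtheta is propagated forward alongside X, and theta is updated online with the resulting gradient estimate.

```python
import math
import random


def online_forward_propagation(beta=2.0, theta0=0.0, sigma=0.5,
                               dt=0.01, n_steps=200_000, seed=0):
    """Toy online forward propagation for dX = (theta - X) dt + sigma dW.

    Minimizes J(theta) = (E_stationary[X] - beta)^2, whose minimizer is
    theta = beta because the stationary mean of X equals theta.
    All modeling choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    theta = theta0
    x = 0.0        # SDE state X_t
    x_tilde = 0.0  # forward-propagated derivative dX_t / dtheta
    sqrt_dt = math.sqrt(dt)

    for k in range(n_steps):
        # Assumed decaying learning rate alpha_t = 1 / (1 + t).
        alpha = 1.0 / (1.0 + k * dt)

        # Gradient estimate of J: 2 * (f(X) - beta) * f'(X) * X~ with f(x) = x.
        d_theta = -2.0 * alpha * (x - beta) * x_tilde * dt

        # Derivative dynamics: dX~ = (mu_x * X~ + mu_theta) dt with
        # mu(x, theta) = theta - x, so mu_x = -1 and mu_theta = 1.
        d_x_tilde = (1.0 - x_tilde) * dt

        # Euler-Maruyama step for the SDE state under the current theta.
        d_x = (theta - x) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)

        # Update all three quantities simultaneously.
        theta += d_theta
        x_tilde += d_x_tilde
        x += d_x

    return theta, x_tilde


if __name__ == "__main__":
    theta_hat, x_tilde_final = online_forward_propagation()
    print(f"theta estimate: {theta_hat:.3f}, dX/dtheta: {x_tilde_final:.3f}")
```

In this toy setting theta should drift toward beta while X~ settles at 1, the sensitivity of the stationary mean to theta. Because f(x) = x has a constant derivative, a single simulated path is enough here; a nonlinear f or a multi-dimensional model would need a more careful gradient estimate than this sketch provides.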