Paper title
Pontryagin's Minimum Principle and Forward-Backward Sweep Method for the System of HJB-FP Equations in Memory-Limited Partially Observable Stochastic Control
Authors
Abstract
Memory-limited partially observable stochastic control (ML-POSC) is the stochastic optimal control problem under incomplete information and memory limitation. To obtain the optimal control function of ML-POSC, a system consisting of the forward Fokker-Planck (FP) equation and the backward Hamilton-Jacobi-Bellman (HJB) equation must be solved. In this work, we first show that the system of HJB-FP equations can be interpreted via Pontryagin's minimum principle on the probability density function space. Based on this interpretation, we then apply the forward-backward sweep method (FBSM), which has been used with Pontryagin's minimum principle, to ML-POSC. FBSM is an algorithm that alternately computes the forward FP equation and the backward HJB equation. Although the convergence of FBSM is generally not guaranteed, it is guaranteed in ML-POSC because the coupling of the HJB-FP equations is limited to the optimal control function.
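To illustrate the alternating structure of FBSM described in the abstract, the following is a minimal sketch on a deterministic scalar toy problem, not the ML-POSC system itself. It assumes the dynamics dx/dt = u with running cost (x^2 + u^2)/2 and no terminal cost, so the costate obeys dλ/dt = -x with λ(T) = 0 and the Hamiltonian is minimized by u = -λ. In ML-POSC the forward sweep would instead integrate the FP equation and the backward sweep the HJB equation. The function name, the explicit Euler discretization, and the relaxation factor `relax` are assumptions made for this example only.

```python
import math


def forward_backward_sweep(x0=1.0, T=1.0, n=1000, relax=0.5,
                           tol=1e-8, max_iter=500):
    """Toy FBSM for: minimize the integral of (x^2 + u^2)/2 with dx/dt = u.

    Hypothetical illustration only; this is not the HJB-FP system of ML-POSC.
    """
    dt = T / n
    u = [0.0] * (n + 1)  # initial guess for the control on the time grid
    for iteration in range(1, max_iter + 1):
        # Forward sweep: integrate the state from t = 0 with the current control.
        x = [x0] * (n + 1)
        for k in range(n):
            x[k + 1] = x[k] + dt * u[k]

        # Backward sweep: integrate the costate from t = T, where lam(T) = 0.
        lam = [0.0] * (n + 1)
        for k in range(n, 0, -1):
            lam[k - 1] = lam[k] + dt * x[k]

        # Control update: u = -lam minimizes the Hamiltonian; the relaxed
        # mixing with the previous control is an assumption of this sketch.
        u_new = [(1.0 - relax) * uk + relax * (-lk) for uk, lk in zip(u, lam)]
        change = max(abs(a - b) for a, b in zip(u_new, u))
        u = u_new
        if change < tol:
            break
    return u, x, lam, iteration


if __name__ == "__main__":
    u, x, lam, iterations = forward_backward_sweep()
    # For this toy problem the continuous-time optimum is u(0) = -x0 * tanh(T).
    print(iterations, u[0], -math.tanh(1.0))
```

For this toy problem the sweeps form a contraction when T = 1, so the iteration settles; for general problems this is not assured, which is the difficulty the abstract refers to.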