Paper Title

Change Point Detection Approach for Online Control of Unknown Time Varying Dynamical Systems

Authors

Deepan Muthirayan, Ruijie Du, Yanning Shen, Pramod P. Khargonekar

Abstract

We propose a novel change point detection approach for online learning control with full information feedback (state, disturbance, and cost feedback) for unknown time-varying dynamical systems. We show that our algorithm achieves sub-linear regret with respect to the class of Disturbance Action Control (DAC) policies, a widely studied class of policies for online control of dynamical systems, for any sub-linear number of changes and for a very general class of systems: (i) matched disturbance systems with general convex cost functions, and (ii) general systems with linear cost functions. Specifically, a (dynamic) regret of $\Gamma_T^{1/5}T^{4/5}$ can be achieved for these classes of systems, where $\Gamma_T$ is the number of changes of the underlying system and $T$ is the duration of the control episode. That is, the change point detection approach achieves sub-linear regret for any sub-linear number of changes, which previous algorithms such as the one in \cite{minasyan2021online} cannot. Numerically, we demonstrate that the change point detection approach is superior to a standard restart approach \cite{minasyan2021online} and to standard online learning approaches for time-invariant dynamical systems. Our work presents the first regret guarantee for unknown time-varying dynamical systems in terms of a stronger notion of variability, such as the number of changes in the underlying system. The extension of our work to state and output feedback controllers is a subject of future work.
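
To make the sub-linearity claim concrete, the following short calculation may help. It is only an illustrative sketch: the constant $c$ and the exponent $\alpha$ are assumptions introduced here and do not appear in the abstract, and constants and any logarithmic factors in the stated bound are ignored. Suppose the number of changes grows as $\Gamma_T \le c\,T^{\alpha}$ for some $\alpha \in [0, 1)$. Then

$$
\Gamma_T^{1/5}\, T^{4/5} \;\le\; c^{1/5}\, T^{\alpha/5}\, T^{4/5} \;=\; c^{1/5}\, T^{(4+\alpha)/5},
\qquad \frac{4+\alpha}{5} < 1 \ \text{ whenever } \alpha < 1 .
$$

Under these assumptions the regret bound divided by $T$ tends to zero as $T$ grows. For example, $\Gamma_T = \sqrt{T}$ (that is, $\alpha = 1/2$) gives a bound of order $T^{9/10}$, while a fixed number of changes ($\alpha = 0$) gives order $T^{4/5}$.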
