Multiple Plans are Better than One: Diverse Stochastic Planning

Paper title

Multiple Plans are Better than One: Diverse Stochastic Planning

Authors

Mahsa Ghasemi, Evan Scope Crafts, Bo Zhao, Ufuk Topcu

Abstract

In planning problems, it is often challenging to fully model the desired specifications. In particular, in human-robot interaction, this difficulty may arise because human preferences are either private or complex to model. Consequently, the resulting objective function can only partially capture the specifications, and optimizing it may lead to poor performance with respect to the true specifications. Motivated by this challenge, we formulate a problem, called diverse stochastic planning, that aims to generate a set of representative (small and diverse) behaviors that are near-optimal with respect to the known objective. In particular, the problem aims to compute a set of diverse and near-optimal policies for systems modeled by a Markov decision process. We cast the problem as a constrained nonlinear optimization, for which we propose a solution relying on the Frank-Wolfe method. We then prove that the proposed solution converges to a stationary point and demonstrate its efficacy in several planning problems.
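
For readers unfamiliar with the Frank-Wolfe method named in the abstract, the following is a minimal generic sketch and not the paper's algorithm. It assumes a toy convex quadratic objective over the probability simplex, whereas the paper's problem is a nonlinear program over MDP policies. The function name `frank_wolfe_simplex`, the `target` vector, and the step size 2/(k+2) are illustrative assumptions.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, num_iters=500):
    """Generic Frank-Wolfe over the probability simplex (illustrative only)."""
    x = np.asarray(x0, dtype=float)
    for k in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, <g, s> is minimized
        # at the vertex whose coordinate has the smallest gradient entry.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        # Standard diminishing step size (an assumption, not from the paper).
        gamma = 2.0 / (k + 2.0)
        # A convex combination keeps the iterate feasible without projection.
        x = (1.0 - gamma) * x + gamma * s
    return x

# Hypothetical toy objective: squared distance to a point inside the simplex.
target = np.array([0.2, 0.5, 0.3])
x_star = frank_wolfe_simplex(lambda x: 2.0 * (x - target),
                             x0=np.array([1.0, 0.0, 0.0]))
```

Each iteration only solves a linear problem over the feasible set, which is why the method suits constrained problems where projection is expensive. In this toy setting the iterates approach `target` while always remaining valid probability vectors.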
