On Grounded Planning for Embodied Tasks with Language Models

Paper title

On Grounded Planning for Embodied Tasks with Language Models

Authors

Bill Yuchen Lin, Chengsong Huang, Qian Liu, Wenda Gu, Sam Sommerer, Xiang Ren

Abstract

Language models (LMs) have demonstrated their capability in possessing commonsense knowledge of the physical world, a crucial aspect of performing tasks in everyday life. However, it remains unclear **whether LMs have the capacity to generate grounded, executable plans for embodied tasks.** This is a challenging task as LMs lack the ability to perceive the environment through vision and feedback from the physical environment. In this paper, we address this important research question and present the first investigation into the topic. Our novel problem formulation, named **G-PlanET**, inputs a high-level goal and a data table about objects in a specific environment, and then outputs a step-by-step actionable plan for a robotic agent to follow. To facilitate the study, we establish an **evaluation protocol** and design a dedicated metric to assess the quality of the plans. Our experiments demonstrate that the use of tables for encoding the environment and an iterative decoding strategy can significantly enhance the LMs' ability in grounded planning. Our analysis also reveals interesting and non-trivial findings.
