C^2: Co-design of Robots via Concurrent Networks Coupling Online and Offline Reinforcement Learning

Paper title

C^2: Co-design of Robots via Concurrent Networks Coupling Online and Offline Reinforcement Learning

Authors

Chen, Ci; Xiang, Pingyu; Lu, Haojian; Wang, Yue; Xiong, Rong

Abstract

As computing power grows, data-driven co-design of a robot's morphology and controller has become a promising approach. However, most existing data-driven methods must train a controller for each morphology in order to compute its fitness, which is time-consuming. In contrast, the dual-network framework uses data collected by the individual network under a specific morphology to train a population network, which provides a surrogate function for morphology optimization. This replaces the traditional evaluation of a diverse set of candidates and thereby speeds up training. Despite its considerable results, training both networks online impedes their performance. To address this issue, we propose a concurrent network framework that combines online and offline reinforcement learning (RL) methods. By using the behavior cloning term in a flexible manner, we achieve an effective combination of the two networks. We conducted multiple sets of comparative experiments in simulation and found that the proposed method effectively addresses the issues present in the dual-network framework, improving overall algorithm performance. Furthermore, we validated the algorithm on a real robot, demonstrating its feasibility in a practical application.
