Paper Title
Reinforcement Learning Control of a Biomechanical Model of the Upper Extremity
Paper Authors
Paper Abstract
Among the infinite number of possible movements that can be produced, humans are commonly assumed to choose those that optimize criteria such as minimizing movement time, subject to certain movement constraints like signal-dependent and constant motor noise. While so far these assumptions have only been evaluated for simplified point-mass or planar models, we address the question of whether they can predict reaching movements in a full skeletal model of the human upper extremity. We learn a control policy using a motor babbling approach as implemented in reinforcement learning, using aimed movements of the tip of the right index finger towards randomly placed 3D targets of varying size. We use a state-of-the-art biomechanical model, which includes seven actuated degrees of freedom. To deal with the curse of dimensionality, we use a simplified second-order muscle model, acting at each degree of freedom instead of individual muscles. The results confirm that the assumptions of signal-dependent and constant motor noise, together with the objective of movement time minimization, are sufficient for a state-of-the-art skeletal model of the human upper extremity to reproduce complex phenomena of human movement, in particular Fitts' Law and the 2/3 Power Law. This result supports the notion that control of the complex human biomechanical system can plausibly be determined by a set of simple assumptions and can easily be learned.
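The abstract names Fitts' Law, the 2/3 Power Law, and signal-dependent plus constant motor noise without stating their formulas. As a rough illustration only, the sketch below uses the commonly cited textbook forms: the Shannon formulation of Fitts' Law, MT = a + b * log2(D / W + 1), and the tangential-speed form of the 2/3 Power Law, v = k * curvature^(-1/3). The function names, the coefficients `a`, `b`, and `gain`, and the noise levels `sigma_signal` and `sigma_const` are hypothetical placeholders, not values or code from the paper.

```python
import math
import random


def fitts_movement_time(distance, width, a=0.0, b=0.1):
    """Predicted movement time, Shannon formulation: a + b * log2(D / W + 1).

    a and b are illustrative regression coefficients (assumed, not from the paper).
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return a + b * math.log2(distance / width + 1.0)


def power_law_speed(curvature, gain=1.0):
    """Tangential speed under the 2/3 Power Law: v = gain * curvature^(-1/3).

    Equivalent to angular velocity = gain * curvature^(2/3).
    """
    if curvature <= 0:
        raise ValueError("curvature must be > 0")
    return gain * curvature ** (-1.0 / 3.0)


def noisy_control(u, sigma_signal=0.1, sigma_const=0.01, rng=random):
    """Add signal-dependent and constant Gaussian noise to a control signal u.

    The signal-dependent term has a standard deviation proportional to |u|;
    the constant term has a fixed standard deviation. Both levels are assumed.
    """
    signal_dependent = rng.gauss(0.0, sigma_signal * abs(u))
    constant = rng.gauss(0.0, sigma_const)
    return u + signal_dependent + constant


if __name__ == "__main__":
    # Target 0.3 m away, 0.1 m wide: index of difficulty is log2(4) = 2 bits.
    print(fitts_movement_time(0.3, 0.1))
    print(power_law_speed(8.0))
    print(noisy_control(1.0))
```

In this form, halving the target width at a fixed distance raises the index of difficulty and therefore the predicted movement time, while larger control signals are perturbed by proportionally larger noise.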