Visual Grounding of Learned Physical Models

Paper Title

Visual Grounding of Learned Physical Models

Authors

Li, Yunzhu; Lin, Toru; Yi, Kexin; Bear, Daniel M.; Yamins, Daniel L. K.; Wu, Jiajun; Tenenbaum, Joshua B.; Torralba, Antonio

Abstract

Humans intuitively recognize objects' physical properties and predict their motion, even when the objects are engaged in complicated interactions. The abilities to perform physical reasoning and to adapt to new environments, while intrinsic to humans, remain challenging to state-of-the-art computational models. In this work, we present a neural model that simultaneously reasons about physics and makes future predictions based on visual and dynamics priors. The visual prior predicts a particle-based representation of the system from visual observations. An inference module operates on those particles, predicting and refining estimates of particle locations, object states, and physical parameters, subject to the constraints imposed by the dynamics prior, which we refer to as visual grounding. We demonstrate the effectiveness of our method in environments involving rigid objects, deformable materials, and fluids. Experiments show that our model can infer the physical properties within a few observations, which allows the model to quickly adapt to unseen scenarios and make accurate predictions into the future.
