Paper title
A Visuospatial Dataset for Naturalistic Verb Learning
Paper authors
Abstract
We introduce a new dataset for training and evaluating grounded language models. Our data is collected within a virtual reality environment and is designed to emulate the quality of language data to which a pre-verbal child is likely to have access: that is, naturalistic, spontaneous speech paired with richly grounded visuospatial context. We use the collected data to compare several distributional semantics models for verb learning. We evaluate neural models based on 2D (pixel) features as well as feature-engineered models based on 3D (symbolic, spatial) features, and show that neither modeling approach achieves satisfactory performance. Our results are consistent with evidence from child language acquisition that emphasizes the difficulty of learning verbs from naive distributional data. We discuss avenues for future work on cognitively inspired grounded language learning, and release our corpus with the intent of facilitating research on the topic.