CLEVR Parser: A Graph Parser Library for Geometric Learning on Language Grounded Image Scenes

Paper Title

CLEVR Parser: A Graph Parser Library for Geometric Learning on Language Grounded Image Scenes

Authors

Saqur, Raeid; Deshpande, Ameet

Abstract

The CLEVR dataset has been used extensively for language-grounded visual reasoning in the Machine Learning (ML) and Natural Language Processing (NLP) domains. We present a graph parser library for CLEVR that provides functionality for extracting object-centric attributes and relationships and for constructing structural graph representations for dual modalities. Structural, order-invariant representations enable geometric learning and can aid downstream tasks such as language grounding to vision, robotics, compositionality, interpretability, and computational grammar construction. We provide three extensible main components (parser, embedder, and visualizer) that can be tailored to suit specific learning setups. We also provide out-of-the-box functionality for seamless integration with popular deep graph neural network (GNN) libraries. Additionally, we discuss downstream usage and applications of the library, and how it accelerates research for the NLP research community.
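To make the idea of a structural, order-invariant scene representation concrete, here is a minimal sketch in plain Python. It is not the library's API: the function names `scene_to_graph` and `canonical_form` are hypothetical, the scene is hand-written, and the sketch assumes the CLEVR scene JSON convention that `relationships[rel][i]` lists the objects standing in relation `rel` to object `i`.

```python
from typing import Dict, List, Tuple

# Hypothetical, hand-written scene in the style of a CLEVR scene JSON entry.
scene = {
    "objects": [
        {"shape": "cube", "color": "red", "size": "large", "material": "metal"},
        {"shape": "sphere", "color": "blue", "size": "small", "material": "rubber"},
    ],
    "relationships": {
        "left": [[1], []],   # object 1 is left of object 0
        "right": [[], [0]],  # object 0 is right of object 1
    },
}


def scene_to_graph(scene: dict) -> Tuple[Dict[int, dict], List[Tuple[int, str, int]]]:
    """Return (nodes, edges): nodes map index -> attributes,
    edges are (source, relation, target) triples."""
    nodes = {i: dict(obj) for i, obj in enumerate(scene["objects"])}
    edges = []
    for relation, per_object in scene["relationships"].items():
        for target, sources in enumerate(per_object):
            for source in sources:
                edges.append((source, relation, target))
    return nodes, edges


def canonical_form(scene: dict):
    """Describe the graph by attributes instead of list positions, so the
    result does not depend on the order in which objects are listed.
    (Illustrative only: objects with identical attributes are not told apart.)"""
    nodes, edges = scene_to_graph(scene)

    def key(i: int) -> tuple:
        return tuple(sorted(nodes[i].items()))

    node_keys = sorted(key(i) for i in nodes)
    edge_keys = sorted((key(s), rel, key(t)) for s, rel, t in edges)
    return node_keys, edge_keys
```

Listing the same two objects in the opposite order (with the relationship indices updated to match) yields the same canonical form, which is the property the abstract refers to as order invariance.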
