Differentiated Relevances Embedding for Group-based Referring Expression Comprehension

Paper Title

Differentiated Relevances Embedding for Group-based Referring Expression Comprehension

Authors

Fuhai Chen, Xuri Ge, Xiaoshuai Sun, Yue Gao, Jianzhuang Liu, Fufeng Chen, Wenjie Li

Abstract

The key to referring expression comprehension (REC) lies in capturing cross-modal visual-linguistic relevance. Existing works typically model this relevance within each image, where the anchor object/expression and its positive expression/object share an attribute with the negative expression/object but differ in the attribute value. These objects/expressions are used exclusively to learn the implicit representation of the attribute from a single pair of different values. This impedes the accuracy of the attribute representations, the expression/object representations, and their cross-modal relevances, because each anchor object/expression usually has multiple attributes and each attribute usually has multiple potential values. To this end, we investigate a novel REC problem named Group-based REC, in which each object/expression is simultaneously used to construct multiple triplets across semantically similar images. To tackle the explosion of negatives and the differentiation of the anchor-negative relevance scores, we propose a multi-group self-paced relevance learning schema that adaptively assigns different priorities to within-group object-expression pairs based on their cross-modal relevances. Since the average cross-modal relevance varies considerably across groups, we further design an across-group relevance constraint to balance the bias of the group priority. Experiments on three standard REC benchmarks demonstrate the effectiveness and superiority of our method.
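To make the idea of prioritizing within-group triplets more concrete, here is a minimal sketch in plain Python. It is not the authors' formulation, which the abstract does not give. It assumes a hinge-style triplet loss on relevance scores, the generic linear soft self-paced weighting w = max(0, 1 - loss / threshold), and, as a stand-in for the across-group constraint, a threshold scaled by each group's mean loss. All function names and the `pace` and `margin` parameters are hypothetical.

```python
from typing import Dict, List


def triplet_hinge(pos_score: float, neg_score: float, margin: float = 0.1) -> float:
    """Hinge loss for one (anchor, positive, negative) triplet of relevance scores."""
    return max(0.0, margin + neg_score - pos_score)


def self_paced_weights(losses: List[float], threshold: float) -> List[float]:
    """Linear soft self-paced weights (generic scheme, assumed here).

    Triplets with low loss get a weight near 1; triplets whose loss reaches
    the threshold get weight 0 and are deferred to later training stages.
    """
    return [max(0.0, 1.0 - loss / threshold) for loss in losses]


def group_balanced_weights(
    group_losses: Dict[str, List[float]], pace: float
) -> Dict[str, List[float]]:
    """Weight triplets per group, scaling the threshold by the group's mean loss.

    Assumption: this per-group scaling only illustrates the idea of balancing
    groups whose average relevance differs; it is not the paper's constraint.
    """
    weights: Dict[str, List[float]] = {}
    for name, losses in group_losses.items():
        mean_loss = sum(losses) / len(losses) if losses else 0.0
        if mean_loss == 0.0:
            # Every triplet in the group is already satisfied; keep them all.
            weights[name] = [1.0 for _ in losses]
        else:
            weights[name] = self_paced_weights(losses, pace * mean_loss)
    return weights


if __name__ == "__main__":
    # Two groups whose losses differ by an order of magnitude.
    groups = {"group_a": [0.1, 0.3], "group_b": [1.0, 3.0]}
    print(group_balanced_weights(groups, pace=2.0))
```

With a single global threshold, a group with uniformly large losses would have most of its triplets down-weighted or dropped. Scaling the threshold per group keeps the relative ordering inside each group while giving the groups comparable priority.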
