Paper Title
Efficient Cross-Modal Retrieval via Deep Binary Hashing and Quantization
Paper Authors
Paper Abstract
Cross-modal retrieval aims to search for data with similar semantic meanings across different content modalities. However, cross-modal retrieval requires substantial storage and retrieval time, since it must process data in multiple modalities. Existing works focus on learning single-source compact features, such as binary hash codes, that preserve similarities between different modalities. In this work, we propose a jointly learned deep hashing and quantization network (HQ) for cross-modal retrieval. We simultaneously learn binary hash codes and quantization codes that preserve semantic information across multiple modalities with an end-to-end deep learning architecture. At the retrieval step, binary hashing is used to retrieve a subset of items from the search space, and quantization is then used to re-rank the retrieved items. We show, both theoretically and empirically, that this two-stage retrieval approach provides faster retrieval while preserving accuracy. Experimental results on the NUS-WIDE, MIR-Flickr, and Amazon datasets demonstrate that HQ improves precision by more than 7% over supervised neural-network-based compact coding models.
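To make the two-stage retrieval concrete, below is a minimal NumPy sketch of the search procedure the abstract describes: binary hash codes first filter the database down to a candidate set via Hamming distance, then quantization codes re-rank those candidates by distance between the query feature and their reconstructed vectors. All names and shapes here (`hamming_distances`, `two_stage_retrieval`, the product-quantization-style `codebook`) are illustrative assumptions, not identifiers or design details from the paper.

```python
# A minimal sketch of two-stage retrieval (hash filter + quantization re-rank),
# assuming hash codes are stored as packed uint8 bit arrays and quantization
# codes index a per-subspace codebook. Illustrative only.
import numpy as np

def hamming_distances(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Hamming distance between one packed uint8 hash code and N database codes."""
    # XOR isolates differing bits; unpack to bits and count them per item.
    xor = np.bitwise_xor(db_codes, query_code)        # (N, n_bytes)
    return np.unpackbits(xor, axis=1).sum(axis=1)     # (N,)

def two_stage_retrieval(query_hash, query_feat, db_hashes, db_quant_codes,
                        codebook, radius=8, top_k=10):
    """Stage 1: Hamming-radius filter on binary hash codes.
    Stage 2: re-rank the surviving candidates by quantized distance."""
    # Stage 1: keep only items whose hash code is within `radius` bits
    # of the query's hash code.
    dists = hamming_distances(query_hash, db_hashes)
    candidates = np.flatnonzero(dists <= radius)

    # Stage 2: reconstruct each candidate from its quantization codes
    # (product-quantization style: M subspaces, K centroids each) and
    # rank by squared distance to the query's continuous feature.
    # db_quant_codes: (N, M) centroid indices; codebook: (M, K, d // M).
    M = db_quant_codes.shape[1]
    recon = np.concatenate(
        [codebook[m, db_quant_codes[candidates, m]] for m in range(M)], axis=1)
    sq_dist = ((recon - query_feat) ** 2).sum(axis=1)
    order = np.argsort(sq_dist)[:top_k]
    return candidates[order]
```

The Hamming filter is cheap (bitwise XOR plus popcount over the whole database), so only the small candidate set pays the cost of codebook lookups and floating-point distances, which is the source of the speedup claimed for the two-stage approach.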