Content-Aware Inter-Scale Cost Aggregation for Stereo Matching

Paper title

Content-Aware Inter-Scale Cost Aggregation for Stereo Matching

Authors

Yao, Chengtang; Jia, Yunde; Di, Huijun; Wu, Yuwei; Yu, Lidong

Abstract

Cost aggregation is a key component of stereo matching for high-quality depth estimation. Most methods use multi-scale processing to downsample the cost volume for proper context information, but this causes a loss of detail when upsampling. In this paper, we present a content-aware inter-scale cost aggregation method that adaptively aggregates and upsamples the cost volume from coarse scale to fine scale by learning dynamic filter weights according to the content of the left and right views on the two scales. Our method achieves reliable detail recovery when upsampling through the aggregation of information across different scales. Furthermore, a novel decomposition strategy is proposed to efficiently construct the 3D filter weights and aggregate the 3D cost volume, which greatly reduces the computation cost. We first learn the 2D similarities via the feature maps on the two scales, and then build the 3D filter weights based on the 2D similarities from the left and right views. After that, we split the aggregation in a full 3D spatial-disparity space into the aggregation in 1D disparity space and 2D spatial space. Experimental results on the Scene Flow, KITTI2015, and Middlebury datasets demonstrate the effectiveness of our method.
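
The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the nearest-neighbour upsampling, the order of the two passes (disparity first, then spatial), and the random weights standing in for learned, content-dependent filter weights are all assumptions made for illustration. It only shows how a per-pixel 3D aggregation can be replaced by a 1D pass along disparity followed by a 2D pass over the image plane.

```python
import numpy as np

def upsample_nearest(cost, scale):
    """Nearest-neighbour upsampling of a (D, H, W) cost volume on all three axes (assumed scheme)."""
    for axis in range(3):
        cost = np.repeat(cost, scale, axis=axis)
    return cost

def normalize(logits, axes):
    """Softmax over the kernel axes so the weights at each pixel sum to 1."""
    e = np.exp(logits - logits.max(axis=axes, keepdims=True))
    return e / e.sum(axis=axes, keepdims=True)

def aggregate_disparity(cost, w_disp):
    """1D aggregation along disparity. w_disp has shape (K, H, W): one K-tap filter per pixel."""
    K = w_disp.shape[0]
    r = K // 2
    D = cost.shape[0]
    padded = np.pad(cost, ((r, r), (0, 0), (0, 0)), mode="edge")
    out = np.zeros(cost.shape, dtype=float)
    for k in range(K):
        # w_disp[k] is (H, W) and broadcasts over the disparity axis.
        out += w_disp[k] * padded[k:k + D]
    return out

def aggregate_spatial(cost, w_spatial):
    """2D aggregation over the image plane. w_spatial has shape (K, K, H, W)."""
    K = w_spatial.shape[0]
    r = K // 2
    _, H, W = cost.shape
    padded = np.pad(cost, ((0, 0), (r, r), (r, r)), mode="edge")
    out = np.zeros(cost.shape, dtype=float)
    for i in range(K):
        for j in range(K):
            out += w_spatial[i, j] * padded[:, i:i + H, j:j + W]
    return out

# Hypothetical example: a coarse 4 x 3 x 5 volume upsampled by 2, with K = 3.
rng = np.random.default_rng(0)
coarse = rng.normal(size=(4, 3, 5))
fine = upsample_nearest(coarse, 2)                       # shape (8, 6, 10)
w_disp = normalize(rng.normal(size=(3, 6, 10)), axes=0)  # placeholder for learned weights
w_spatial = normalize(rng.normal(size=(3, 3, 6, 10)), axes=(0, 1))
refined = aggregate_spatial(aggregate_disparity(fine, w_disp), w_spatial)
```

With a kernel size of K = 3, a full 3D filter needs K^3 = 27 weights per location, while the separated form in this sketch needs K + K^2 = 12. That gap is the kind of saving the decomposition aims for, although the paper's actual construction of the weights from left-view and right-view similarities is not reproduced here.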
