Paper title
Deep Learning Stereo Vision at the edge
Authors
Abstract
We present an overview of the methodology used to build a new stereo vision solution suitable for System on Chip (SoC) devices. The solution was developed to bring computer vision capability to embedded devices that operate in power-constrained environments. It is constructed as a hybrid between classical stereo vision techniques and deep learning approaches. The stereoscopic module is composed of two separate modules: one that accelerates the neural network we trained and one that accelerates the front-end part. The system is completely passive and does not require any structured light to obtain very compelling accuracy. Compared with previous stereo vision solutions offered by industry, a major improvement is robustness to noise, which is mainly due to the deep learning part of the chosen architecture. We submitted our results to the Middlebury dataset challenge, where the system currently ranks as the best System on Chip solution. The system has been developed for low-latency applications that require better-than-real-time performance on high-definition video.