Paper Title
On-Device Text Image Super Resolution
Paper Authors
Paper Abstract
Recent research on super-resolution (SR) has witnessed major developments with the advancements of deep convolutional neural networks. There is a need for on-device information extraction from scene text images and even document images, most of which are low-resolution (LR). SR therefore becomes an essential pre-processing step, since bicubic upsampling, the method conventionally available on smartphones, performs poorly on LR images. To give users more control over their privacy, and to reduce the carbon footprint by cutting the overhead of cloud computing and hours of GPU usage, executing SR models on the edge has become a necessity. Running and optimizing a model on resource-constrained platforms such as smartphones poses various challenges. In this paper, we present a novel deep neural network that reconstructs sharper character edges and thus boosts OCR confidence. The proposed architecture not only achieves a significant improvement in PSNR over bicubic upsampling on various benchmark datasets but also runs with an average inference time of 11.7 ms per image. We outperform the state-of-the-art on the Text330 dataset. We also achieve an OCR accuracy of 75.89% on the ICDAR 2015 TextSR dataset, where the ground truth has an accuracy of 78.10%.
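The abstract compares against the bicubic upsampling conventionally available on smartphones and reports gains in PSNR. Below is a minimal, illustrative sketch of that baseline and of the PSNR metric, not the paper's own pipeline or network; the x4 scale factor, the file names, and the synthetic bicubic-downsampling degradation are assumptions made for illustration only.

```python
import cv2
import numpy as np

def psnr(reference, estimate, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two same-size uint8 images."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10((max_val ** 2) / mse)

# Hypothetical ground-truth text image; path is an assumption.
hr = cv2.imread("hr_text.png")

# Synthetic LR input produced by x4 bicubic downsampling (assumed degradation).
lr = cv2.resize(hr, (hr.shape[1] // 4, hr.shape[0] // 4),
                interpolation=cv2.INTER_CUBIC)

# Conventional baseline: bicubic upsampling back to the ground-truth size.
bicubic = cv2.resize(lr, (hr.shape[1], hr.shape[0]),
                     interpolation=cv2.INTER_CUBIC)

print(f"Bicubic PSNR vs. ground truth: {psnr(hr, bicubic):.2f} dB")
```

An SR network such as the one proposed in the paper would replace the second `cv2.resize` call, and its output would be scored with the same PSNR function (and fed to OCR) for comparison against this bicubic baseline.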