Paper Title
UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World
Paper Authors
Paper Abstract
Synthetic data has been a critical tool for training scene text detection and recognition models. On the one hand, synthetic word images have proven to be a successful substitute for real images in training scene text recognizers. On the other hand, however, scene text detectors still heavily rely on a large amount of manually annotated real-world images, which are expensive to obtain. In this paper, we introduce UnrealText, an efficient image synthesis method that renders realistic images via a 3D graphics engine. The 3D synthesis engine provides realistic appearance by rendering the scene and text as a whole, and allows for better text region proposals with access to precise scene information, e.g. surface normals and even object meshes. Comprehensive experiments verify its effectiveness on both scene text detection and recognition. We also generate a multilingual version for future research into multilingual scene text detection and recognition. Additionally, we re-annotate scene text recognition datasets in a case-sensitive way and include punctuation marks for more comprehensive evaluations. The code and the generated datasets are released at: https://github.com/Jyouhou/UnrealText/ .
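The abstract notes that access to precise scene information such as surface normals enables better text region proposals. As a rough illustration only, and not the released implementation, the sketch below shows one way per-pixel normals exported from a 3D engine could be used to keep near-planar candidate regions for text placement; the `normal_map` layout, the function name, and the angular threshold are all assumptions for this example.

```python
# Hypothetical sketch: filter candidate text regions by surface flatness,
# using a per-pixel normal map assumed to be exported by the 3D engine
# as an H x W x 3 array of unit normal vectors.
import numpy as np

def is_near_planar(normal_map: np.ndarray, box, angle_thresh_deg: float = 10.0) -> bool:
    """Return True if normals inside `box` (x0, y0, x1, y1) deviate from
    their mean direction by less than `angle_thresh_deg` on average."""
    x0, y0, x1, y1 = box
    patch = normal_map[y0:y1, x0:x1].reshape(-1, 3)
    mean_dir = patch.mean(axis=0)
    mean_dir /= np.linalg.norm(mean_dir) + 1e-8
    cos_sim = np.clip(patch @ mean_dir, -1.0, 1.0)
    mean_angle = np.degrees(np.arccos(cos_sim)).mean()
    return mean_angle < angle_thresh_deg

# Toy usage: a synthetic normal map for a flat, front-facing wall.
normal_map = np.zeros((480, 640, 3), dtype=np.float32)
normal_map[..., 2] = 1.0  # every normal points toward the camera
print(is_near_planar(normal_map, (100, 100, 300, 200)))  # True: region is planar
```

A flatness test of this kind is only one piece of a proposal pipeline; the paper's released code should be consulted for how region proposals are actually generated.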