SMILE: Sequence-to-Sequence Domain Adaption with Minimizing Latent Entropy for Text Image Recognition

Paper Title

SMILE: Sequence-to-Sequence Domain Adaption with Minimizing Latent Entropy for Text Image Recognition

Authors

Yen-Cheng Chang, Yi-Chang Chen, Yu-Chuan Chang, Yi-Ren Yeh

Abstract

Training recognition models with synthetic images has achieved remarkable results in text recognition. However, recognizing text in real-world images remains challenging because of the domain shift between synthetic and real-world text images. One strategy for eliminating this domain difference without manual annotation is unsupervised domain adaptation (UDA). Because text recognition is a sequential labeling task, most popular UDA methods cannot be applied to it directly. To tackle this problem, we propose a UDA method that minimizes latent entropy on sequence-to-sequence attention-based models with class-balanced self-paced learning. Our experiments show that the proposed framework achieves better recognition results than existing methods on most UDA text recognition benchmarks. All code is publicly available.
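
The entropy idea mentioned in the abstract can be illustrated generically. The sketch below is not the authors' implementation. As a simplifying assumption, it treats the decoder as emitting one probability distribution over characters per decoding step, and it computes the mean Shannon entropy across steps, the kind of quantity an entropy-minimization objective would push down on unlabeled target-domain images. The function names and the toy distributions are hypothetical.

```python
import math


def step_entropy(probs):
    """Shannon entropy (in nats) of one probability distribution."""
    # Zero-probability entries contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_sequence_entropy(step_probs):
    """Average entropy over all decoding steps of one predicted sequence."""
    if not step_probs:
        raise ValueError("need at least one decoding step")
    return sum(step_entropy(p) for p in step_probs) / len(step_probs)


# Toy example (assumed values): two decoding steps over a 4-character alphabet.
confident = [[0.97, 0.01, 0.01, 0.01], [0.01, 0.97, 0.01, 0.01]]
uncertain = [[0.25, 0.25, 0.25, 0.25], [0.40, 0.30, 0.20, 0.10]]

print(mean_sequence_entropy(confident))
print(mean_sequence_entropy(uncertain))
```

Peaked per-step distributions give a lower mean entropy than flat ones, so minimizing this value on target-domain predictions encourages confident outputs. The abstract pairs entropy minimization with class-balanced self-paced learning, which this sketch does not cover.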
