Cross-modal Multi-task Learning for Graphic Recognition of Caricature Face

Paper Title

Cross-modal Multi-task Learning for Graphic Recognition of Caricature Face

Authors

Zuheng Ming, Jean-Christophe Burie, Muhammad Muzzamil Luqman

Abstract

Face recognition on realistic visual images has been well studied and has made significant progress over the past decade. Caricature face recognition, by contrast, still falls far short of the performance achieved on visual images. This is largely due to the extreme non-rigid distortions that caricatures introduce by exaggerating facial features to emphasize the subject's character. Because caricatures and visual images are heterogeneous modalities, caricature-visual face recognition is a cross-modal problem. In this paper, we propose a method for caricature-visual face recognition via multi-task learning. Rather than using conventional multi-task learning with fixed task weights, this work proposes an approach that learns the task weights according to the importance of the tasks. The proposed multi-task learning with dynamic task weights makes it possible to train the hard and easy tasks appropriately, instead of getting stuck over-training the easy task as conventional methods do. The experimental results demonstrate the effectiveness of the proposed dynamic multi-task learning for cross-modal caricature-visual face recognition. Performance on the CaVI and WebCaricature datasets shows its superiority over state-of-the-art methods.
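The contrast between fixed and dynamic task weights can be illustrated with a small sketch. The abstract does not specify how the weights are learned, so the rule below (a softmax over the current per-task losses, so that the task with the larger loss receives the larger weight) is an assumption chosen for illustration and should not be read as the authors' network. The function names, the `temperature` parameter, and the two example loss values are likewise hypothetical.

```python
import math


def dynamic_task_weights(losses, temperature=1.0):
    """Return weights that sum to 1, larger for tasks with larger loss.

    Assumption for illustration only: a softmax over the current task
    losses. The paper's actual weighting mechanism is not given in the
    abstract.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [loss / temperature for loss in losses]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_total_loss(losses, weights):
    """Combine per-task losses into one scalar training objective."""
    return sum(w * loss for w, loss in zip(weights, losses))


# Hypothetical loss values: a hard task (e.g. caricature recognition)
# and an easy task (e.g. visual-image recognition).
task_losses = [2.0, 0.5]

fixed_weights = [0.5, 0.5]
dynamic_weights = dynamic_task_weights(task_losses)

fixed_objective = weighted_total_loss(task_losses, fixed_weights)
dynamic_objective = weighted_total_loss(task_losses, dynamic_weights)
```

With fixed weights both tasks contribute in the same proportion throughout training, whereas in this sketch the weight shifts toward whichever task currently has the higher loss, which is one simple way to avoid spending most of the training effort on the easy task.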
