SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning

Paper Title

SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning

Authors

Feng, Tzu-hsun, Dong, Annie, Yeh, Ching-Feng, Yang, Shu-wen, Lin, Tzu-Quan, Shi, Jiatong, Chang, Kai-Wei, Huang, Zili, Wu, Haibin, Chang, Xuankai, Watanabe, Shinji, Mohamed, Abdelrahman, Li, Shang-Wen, Lee, Hung-yi

Abstract

We present the SUPERB challenge at SLT 2022, which aims at learning self-supervised speech representation for better performance, generalization, and efficiency. The challenge builds upon the SUPERB benchmark and implements metrics to measure the computation requirements of self-supervised learning (SSL) representation and to evaluate its generalizability and performance across the diverse SUPERB tasks. The SUPERB benchmark provides comprehensive coverage of popular speech processing tasks, from speech and speaker recognition to audio generation and semantic understanding. As SSL has gained interest in the speech community and shown promising outcomes, we envision the challenge to uplevel the impact of SSL techniques by motivating more practical designs of techniques beyond task performance. We summarize the results of 14 submitted models in this paper. We also discuss the main findings from those submissions and the future directions of SSL research.
