Paper Title

Breeding Gender-aware Direct Speech Translation Systems

Authors

Marco Gaido, Beatrice Savoldi, Luisa Bentivogli, Matteo Negri, Marco Turchi

Abstract

In automatic speech translation (ST), traditional cascade approaches involving separate transcription and translation steps are giving ground to increasingly competitive and more robust direct solutions. In particular, by translating speech audio data without intermediate transcription, direct ST models are able to leverage and preserve essential information present in the input (e.g. speaker's vocal characteristics) that is otherwise lost in the cascade framework. Although such ability proved to be useful for gender translation, direct ST is nonetheless affected by gender bias just like its cascade counterpart, as well as machine translation and numerous other natural language processing applications. Moreover, direct ST systems that exclusively rely on vocal biometric features as a gender cue can be unsuitable and potentially harmful for certain users. Going beyond speech signals, in this paper we compare different approaches to inform direct ST models about the speaker's gender and test their ability to handle gender translation from English into Italian and French. To this aim, we manually annotated large datasets with speakers' gender information and used them for experiments reflecting different possible real-world scenarios. Our results show that gender-aware direct ST solutions can significantly outperform strong - but gender-unaware - direct ST models. In particular, the translation of gender-marked words can increase up to 30 points in accuracy while preserving overall translation quality.
