Paper Title

Metric Learning vs Classification for Disentangled Music Representation Learning

Authors

Jongpil Lee, Nicholas J. Bryan, Justin Salamon, Zeyu Jin, Juhan Nam

Abstract

Deep representation learning offers a powerful paradigm for mapping input data onto an organized embedding space and is useful for many music information retrieval tasks. Two central methods for representation learning are deep metric learning and classification, both of which aim to learn a representation that generalizes well across tasks. Along with generalization, the emerging concept of disentangled representations is also of great interest, where multiple semantic concepts (e.g., genre, mood, instrumentation) are learned jointly but remain separable in the learned representation space. In this paper, we present a single representation learning framework that elucidates the relationship between metric learning, classification, and disentanglement in a holistic manner. For this, we (1) outline past work on the relationship between metric learning and classification, (2) extend this relationship to multi-label data by exploring three different learning approaches and their disentangled versions, and (3) evaluate all models on four tasks (training time, similarity retrieval, auto-tagging, and triplet prediction). We find that classification-based models are generally advantageous for training time, similarity retrieval, and auto-tagging, while deep metric learning exhibits better performance for triplet prediction. Finally, we show that our proposed approach yields state-of-the-art results for music auto-tagging.
