Formal Analysis of Art: Proxy Learning of Visual Concepts from Style Through Language Models

Title

Formal Analysis of Art: Proxy Learning of Visual Concepts from Style Through Language Models

Authors

Diana Kim, Ahmed Elgammal, Marian Mazzone

Abstract

We present a machine learning system that can quantify fine art paintings with a set of visual elements and principles of art. This formal analysis is fundamental for understanding art, but developing such a system is challenging. Paintings have high visual complexity, and it is also difficult to collect enough training data with direct labels. To resolve these practical limitations, we introduce a novel mechanism, called proxy learning, which learns visual concepts in paintings through their general relation to styles. This framework does not require any visual annotation; it uses only style labels and a general relationship between visual concepts and style. In this paper, we propose a novel proxy model and reformulate four pre-existing methods in the context of proxy learning. Through quantitative and qualitative comparison, we evaluate these methods and compare their effectiveness in quantifying artistic visual concepts, where the general relationship is estimated by a language model, either GloVe or BERT. Language modeling is a practical and scalable solution that requires no labeling, but it is inevitably imperfect. We demonstrate that the new proxy model is robust to this imperfection, while the other models are sensitive to it.
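To make the idea of a "general relationship between visual concepts and style" more concrete, the following is a minimal sketch and not the authors' proxy model. It assumes that the relationship can be approximated by cosine similarity between word embeddings of style names and concept words, and that a painting's concept scores can be read off as a style-probability-weighted sum of those similarities. The three-dimensional vectors, the style and concept words, and the function names are all hypothetical placeholders; real GloVe or BERT embeddings would replace the toy vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors standing in for language-model embeddings.
# They are NOT real GloVe or BERT values.
STYLE_VECS = {
    "impressionism": [0.9, 0.1, 0.3],
    "cubism": [0.1, 0.9, 0.2],
}
CONCEPT_VECS = {
    "blurry": [0.8, 0.2, 0.4],
    "geometric": [0.2, 0.8, 0.1],
}


def relation_matrix(style_vecs, concept_vecs):
    """Estimate a style-by-concept relation from embedding similarity (assumption)."""
    return {
        style: {concept: cosine(sv, cv) for concept, cv in concept_vecs.items()}
        for style, sv in style_vecs.items()
    }


def concept_scores(style_probs, relation):
    """Naive proxy: weight each style's concept relations by the style probability."""
    scores = {}
    for style, prob in style_probs.items():
        for concept, rel in relation[style].items():
            scores[concept] = scores.get(concept, 0.0) + prob * rel
    return scores


relation = relation_matrix(STYLE_VECS, CONCEPT_VECS)
# Hypothetical style-classifier output for one painting.
scores = concept_scores({"impressionism": 0.8, "cubism": 0.2}, relation)
```

With these toy inputs, a painting classified mostly as impressionism receives a higher score for "blurry" than for "geometric". Because the relation matrix comes from embeddings rather than annotation, any error in it propagates directly into the scores in this naive version, which is the kind of imperfection the abstract says a proxy model should be robust to.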
