Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification

Paper Title

Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification

Authors

Yue Yang, Artemis Panagopoulou, Shenghao Zhou, Daniel Jin, Chris Callison-Burch, Mark Yatskar

Abstract

Concept Bottleneck Models (CBM) are inherently interpretable models that factor model decisions into human-readable concepts. They allow people to easily understand why a model is failing, a critical feature for high-stakes applications. CBMs require manually specified concepts and often under-perform their black box counterparts, preventing their broad adoption. We address these shortcomings and are first to show how to construct high-performance CBMs without manual specification of similar accuracy to black box models. Our approach, Language Guided Bottlenecks (LaBo), leverages a language model, GPT-3, to define a large space of possible bottlenecks. Given a problem domain, LaBo uses GPT-3 to produce factual sentences about categories to form candidate concepts. LaBo efficiently searches possible bottlenecks through a novel submodular utility that promotes the selection of discriminative and diverse information. Ultimately, GPT-3's sentential concepts can be aligned to images using CLIP, to form a bottleneck layer. Experiments demonstrate that LaBo is a highly effective prior for concepts important to visual recognition. In the evaluation with 11 diverse datasets, LaBo bottlenecks excel at few-shot classification: they are 11.7% more accurate than black box linear probes at 1 shot and comparable with more data. Overall, LaBo demonstrates that inherently interpretable models can be widely applied at similar, or better, performance than black box approaches.
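
The two mechanical steps named in the abstract, selecting discriminative and diverse concepts and then scoring images against them to form a bottleneck layer, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the utility (a per-concept score plus a facility-location coverage term, which is one standard monotone submodular function) is not claimed to be the paper's utility, the function names `greedy_select` and `bottleneck_logits` are hypothetical, and random unit vectors stand in for CLIP text and image embeddings.

```python
import numpy as np

def greedy_select(scores, sim, k, alpha=1.0):
    """Greedily pick k concepts under an assumed utility
    F(S) = sum_{s in S} scores[s] + alpha * sum_v max_{s in S} sim[v, s].
    scores: (n,) non-negative discriminability score per candidate concept.
    sim:    (n, n) non-negative similarity between candidate concepts.
    """
    scores = np.asarray(scores, dtype=float)
    sim = np.asarray(sim, dtype=float)
    n = scores.shape[0]
    selected = []
    taken = np.zeros(n, dtype=bool)
    coverage = np.zeros(n)  # best similarity of each candidate to the chosen set
    for _ in range(min(k, n)):
        # Marginal gain of adding each candidate: its own score plus the
        # increase in coverage it would bring to every other candidate.
        extra = np.maximum(sim - coverage[:, None], 0.0).sum(axis=0)
        gain = scores + alpha * extra
        gain[taken] = -np.inf  # never pick the same concept twice
        best = int(np.argmax(gain))
        selected.append(best)
        taken[best] = True
        coverage = np.maximum(coverage, sim[:, best])
    return selected

def bottleneck_logits(image_emb, concept_emb, weights):
    """image_emb: (b, d), concept_emb: (m, d), weights: (num_classes, m)."""
    concept_scores = image_emb @ concept_emb.T  # (b, m) image-concept alignment
    return concept_scores @ weights.T           # (b, num_classes) linear layer

def unit(x):
    """Normalize rows to unit length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Random stand-ins for embeddings (assumption: a real pipeline would use CLIP).
rng = np.random.default_rng(0)
concept_emb = unit(rng.normal(size=(6, 8)))   # 6 candidate concepts
image_emb = unit(rng.normal(size=(4, 8)))     # 4 images
scores = rng.random(6)                        # placeholder discriminability
sim = np.clip(concept_emb @ concept_emb.T, 0.0, None)

chosen = greedy_select(scores, sim, k=3)
weights = rng.normal(size=(2, len(chosen)))   # untrained 2-class weights
logits = bottleneck_logits(image_emb, concept_emb[chosen], weights)
```

In this sketch the coverage term is what discourages redundancy: once a concept is chosen, near-duplicates of it add little coverage, so a lower-scoring but dissimilar concept can win the next round. Each row of `weights` is indexed by concept, which is what would let a reader inspect which concepts drive a class prediction.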
