Paper title
Contrastive estimation reveals topic posterior information to linear models
Paper authors
Paper abstract
Contrastive learning is an approach to representation learning that utilizes naturally occurring similar and dissimilar pairs of data points to find useful embeddings of data. In the context of document classification under topic modeling assumptions, we prove that contrastive learning is capable of recovering a representation of documents that reveals their underlying topic posterior information to linear models. We apply this procedure in a semi-supervised setup and demonstrate empirically that linear classifiers with these representations perform well in document classification tasks with very few training examples.
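To make the idea of "naturally occurring similar and dissimilar pairs" concrete, the following is a minimal sketch of one way such pairs might be built from a document collection. It assumes, purely for illustration, that two halves of the same document count as a similar pair and that halves taken from two different documents count as a dissimilar pair; the abstract does not specify the construction. The function name `make_contrastive_pairs` and the toy documents are hypothetical.

```python
import random


def make_contrastive_pairs(docs, seed=0):
    """Build (first_half, second_half, label) triples from tokenized documents.

    label 1: both halves come from the same document (similar pair).
    label 0: the halves come from two different documents (dissimilar pair).
    This pairing rule is an illustrative assumption, not taken from the abstract.
    """
    if len(docs) < 2:
        raise ValueError("need at least two documents to form a dissimilar pair")
    rng = random.Random(seed)
    # Split every document at its midpoint.
    halves = [(doc[: len(doc) // 2], doc[len(doc) // 2:]) for doc in docs]
    pairs = []
    for i, (first, second) in enumerate(halves):
        # Similar pair: the two halves of document i.
        pairs.append((first, second, 1))
        # Dissimilar pair: first half of document i with the second half
        # of a different document chosen uniformly at random.
        j = rng.choice([k for k in range(len(halves)) if k != i])
        pairs.append((first, halves[j][1], 0))
    return pairs


# Toy tokenized documents (hypothetical data).
docs = [
    "the team won the match last night".split(),
    "the central bank raised interest rates".split(),
    "the new phone ships with a faster chip".split(),
]
pairs = make_contrastive_pairs(docs)
for first, second, label in pairs:
    print(label, first, second)
```

A model trained to distinguish label 1 from label 0 on such pairs could then supply document representations, and a linear classifier could be fit on top of them using a small labeled set, in the spirit of the semi-supervised setup the abstract describes.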