Paper Title

Coordinated Topic Modeling

Authors

Pritom Saha Akash, Jie Huang, Kevin Chen-Chuan Chang

Abstract

We propose a new problem called coordinated topic modeling that imitates human behavior while describing a text corpus. It considers a set of well-defined topics like the axes of a semantic space with a reference representation. It then uses the axes to model a corpus for easily understandable representation. This new task helps represent a corpus more interpretably by reusing existing knowledge and benefits the corpora comparison task. We design ECTM, an embedding-based coordinated topic model that effectively uses the reference representation to capture the target corpus-specific aspects while maintaining each topic's global semantics. In ECTM, we introduce the topic- and document-level supervision with a self-training mechanism to solve the problem. Finally, extensive experiments on multiple domains show the superiority of our model over other baselines.
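
The idea of treating predefined topics as axes of a semantic space can be illustrated with a minimal sketch. This is not the authors' ECTM model: it uses no self-training and no learned supervision, and only shows how documents from two corpora might be placed on the same fixed topic axes so that their profiles are directly comparable. The topic names, the 3-dimensional toy embeddings, the `temperature` parameter, and all function names are hypothetical and chosen for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def softmax(scores, temperature=0.1):
    """Turn similarity scores into a distribution (temperature is an assumed knob)."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def topic_coordinates(doc_vec, topic_axes):
    """Place one document on the fixed topic axes."""
    sims = [cosine(doc_vec, axis) for axis in topic_axes.values()]
    return dict(zip(topic_axes, softmax(sims)))

def corpus_profile(doc_vecs, topic_axes):
    """Average the per-document coordinates to describe a whole corpus."""
    coords = [topic_coordinates(d, topic_axes) for d in doc_vecs]
    return {t: sum(c[t] for c in coords) / len(coords) for t in topic_axes}

# Hypothetical reference representations: one toy embedding per predefined topic.
TOPIC_AXES = {
    "sports": [1.0, 0.0, 0.0],
    "politics": [0.0, 1.0, 0.0],
    "science": [0.0, 0.0, 1.0],
}

# Two toy corpora of document embeddings (made-up values).
corpus_a = [[0.9, 0.1, 0.0], [0.8, 0.0, 0.2]]
corpus_b = [[0.1, 0.2, 0.9], [0.0, 0.9, 0.3]]

profile_a = corpus_profile(corpus_a, TOPIC_AXES)
profile_b = corpus_profile(corpus_b, TOPIC_AXES)
print(profile_a)
print(profile_b)
```

Because both corpora are described on the same axes, their profiles can be compared topic by topic. In the paper's setting, the reference representations would additionally be adapted to capture corpus-specific aspects of each topic, which this sketch does not attempt.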
