Paper Title
OntoDSumm : Ontology based Tweet Summarization for Disaster Events
Authors
Abstract
The popularity of social media platforms such as Twitter leads a large fraction of users to share real-time information and short situational messages during disasters. Government organizations, agencies, and volunteers need a summary of these tweets for efficient and quick disaster response. However, the huge influx of tweets makes it difficult to obtain a precise overview of ongoing events manually. To handle this challenge, several tweet summarization approaches have been proposed. Most of the existing literature breaks tweet summarization into a two-step process: the first step categorizes tweets, and the second step chooses representative tweets from each category. Both supervised and unsupervised approaches have been proposed for the first step. Supervised approaches require a large amount of labelled data, which is costly and time-consuming to obtain. Unsupervised approaches, on the other hand, cannot cluster tweets properly because of overlapping keywords, vocabulary size, lack of understanding of semantic meaning, etc. For the second step, existing approaches apply different ranking methods, but these methods are very generic and fail to compute the proper importance of a tweet with respect to a disaster. Both problems can be handled far better with proper domain knowledge. In this paper, we exploit existing domain knowledge by means of an ontology in both steps and propose a novel disaster summarization method, OntoDSumm. We evaluate the proposed method against 4 state-of-the-art methods on 10 disaster datasets. The results show that OntoDSumm outperforms existing methods by approximately 2-66% in terms of ROUGE-1 F1 score.
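To make the two-step process and the ROUGE-1 F1 metric in the abstract concrete, here is a minimal Python sketch. It is not the OntoDSumm implementation: the toy ontology `TOY_ONTOLOGY`, the whitespace tokenizer, the term-overlap scoring, and all function names are assumptions made purely for illustration, since the abstract does not describe the actual ontology or ranking. The ROUGE-1 F1 function computes plain clipped unigram overlap without stemming or stopword removal, so its values may differ from those of the official ROUGE toolkit.

```python
from collections import Counter, defaultdict

# Hypothetical toy "ontology": category -> indicative terms.
# This is an assumption for illustration; it is not the ontology used in the paper.
TOY_ONTOLOGY = {
    "infrastructure": {"bridge", "road", "power", "collapsed"},
    "casualties": {"dead", "injured", "missing"},
}


def tokenize(text):
    """Lowercase and split on whitespace (a simplifying assumption)."""
    return text.lower().split()


def categorize(tweets, ontology):
    """Step 1: assign each tweet to the category it shares the most terms with."""
    groups = defaultdict(list)
    for tweet in tweets:
        tokens = set(tokenize(tweet))
        scores = {cat: len(tokens & terms) for cat, terms in ontology.items()}
        best = max(scores, key=scores.get)
        if scores[best] > 0:  # tweets that match no category are skipped
            groups[best].append(tweet)
    return dict(groups)


def select_representatives(groups, ontology):
    """Step 2: per category, keep the tweet containing the most category terms."""
    return {
        cat: max(tweets, key=lambda t: len(set(tokenize(t)) & ontology[cat]))
        for cat, tweets in groups.items()
    }


def rouge1_f1(candidate, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall (clipped counts)."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


tweets = [
    "bridge collapsed on main road",
    "power is out",
    "three injured and two missing",
    "stay safe everyone",
]
groups = categorize(tweets, TOY_ONTOLOGY)
summary = select_representatives(groups, TOY_ONTOLOGY)
```

With these toy inputs, `summary` holds one tweet per matched category, and `rouge1_f1` can then compare the joined summary with a human-written reference summary. A domain ontology would replace both the keyword sets and the overlap-based ranking used here.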