Paper Title
Exploiting Semantic Role Contextualized Video Features for Multi-Instance Text-Video Retrieval EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022
Paper Authors
Paper Abstract
In this report, we present our approach to the EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022. We first parse sentences into semantic roles corresponding to verbs and nouns, then use self-attention to exploit semantic-role-contextualized video features along with textual features via triplet losses in multiple embedding spaces. Our method surpasses the strong baseline in normalized Discounted Cumulative Gain (nDCG), which is the more informative metric for semantic similarity. Our submission ranked 3rd for nDCG and 4th for mAP.
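
To make the phrase "triplet losses in multiple embedding spaces" concrete, the following is a minimal sketch, not the authors' implementation. It assumes cosine similarity, a hinge margin of 0.2, and one (anchor, positive, negative) triplet per embedding space; the space names "verb" and "noun", the function names, and the toy vectors are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss: zero once sim(anchor, positive) exceeds sim(anchor, negative) by margin."""
    return max(0.0, margin + cosine(anchor, negative) - cosine(anchor, positive))


def multi_space_loss(triplets_by_space, margin=0.2):
    """Sum the triplet loss over several embedding spaces.

    triplets_by_space maps a (hypothetical) space name to an
    (anchor, positive, negative) tuple of vectors in that space.
    """
    return sum(
        triplet_loss(anchor, positive, negative, margin)
        for anchor, positive, negative in triplets_by_space.values()
    )


# Toy 2-D embeddings, assumed for illustration only.
example = {
    # Positive matches the anchor, negative is orthogonal: no penalty.
    "verb": ([1.0, 0.0], [1.0, 0.0], [0.0, 1.0]),
    # Negative is closer to the anchor than the positive: penalized.
    "noun": ([1.0, 0.0], [0.0, 1.0], [1.0, 0.0]),
}
total = multi_space_loss(example)
```

In this toy case only the "noun" space contributes to the total, since its negative sits closer to the anchor than its positive. Summing per-space losses is one simple way to combine spaces; the report's abstract does not specify how the individual losses are weighted.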