Paper Title
EGAD: Evolving Graph Representation Learning with Self-Attention and Knowledge Distillation for Live Video Streaming Events
Paper Authors
Paper Abstract
In this study, we present a dynamic graph representation learning model on weighted graphs to accurately predict the network capacity of connections between viewers in a live video streaming event. We propose EGAD, a neural network architecture that captures the graph evolution by introducing a self-attention mechanism on the weights between consecutive graph convolutional networks. In addition, we account for the fact that neural architectures require a huge number of parameters to train, which increases the online inference latency and negatively affects the user experience in a live video streaming event. To address the high online inference cost incurred by this vast number of parameters, we propose a knowledge distillation strategy. In particular, we design a distillation loss function, aiming to first pretrain a teacher model on offline data, and then transfer the knowledge from the teacher to a smaller student model with fewer parameters. We evaluate our proposed model on the link prediction task on three real-world datasets generated by live video streaming events. The events lasted 80 minutes, and each viewer used the distribution solution provided by the company Hive Streaming AB. The experiments demonstrate the effectiveness of the proposed model in terms of link prediction accuracy and number of required parameters, when evaluated against state-of-the-art approaches. In addition, we study the distillation performance of the proposed model in terms of compression ratio for different distillation strategies, and show that the proposed model can achieve a compression ratio of up to 15:100 while preserving high link prediction accuracy. For reproduction purposes, our evaluation datasets and implementation are publicly available at https://stefanosantaris.github.io/EGAD.
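The teacher-student distillation described in the abstract can be illustrated with a short sketch. The snippet below is a minimal, hypothetical example in PyTorch, not the paper's implementation: the `LinkPredictor` module, the `distillation_loss` function, and the `alpha` weighting are illustrative names and choices, assuming a generic scheme in which a large teacher pretrained on offline data guides a smaller student alongside the ground-truth link-weight supervision.

```python
# Hypothetical sketch of teacher-student knowledge distillation for link-weight
# prediction (not the EGAD architecture itself). A large teacher is pretrained
# on offline data; a smaller student is trained to fit the observed link
# weights while imitating the teacher's predictions.
import torch
import torch.nn as nn


class LinkPredictor(nn.Module):
    """Two-layer MLP scoring a pair of node embeddings (illustrative only)."""

    def __init__(self, embed_dim: int, hidden_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, src_emb: torch.Tensor, dst_emb: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([src_emb, dst_emb], dim=-1)).squeeze(-1)


def distillation_loss(student_pred, teacher_pred, target, alpha=0.5):
    """Blend the supervised regression loss with a teacher-matching term."""
    mse = nn.MSELoss()
    supervised = mse(student_pred, target)       # fit the observed link weights
    imitation = mse(student_pred, teacher_pred)  # imitate the pretrained teacher
    return alpha * supervised + (1.0 - alpha) * imitation


if __name__ == "__main__":
    embed_dim = 32
    teacher = LinkPredictor(embed_dim, hidden_dim=256)  # large, pretrained offline
    student = LinkPredictor(embed_dim, hidden_dim=32)   # much smaller student
    teacher.eval()

    src = torch.randn(64, embed_dim)
    dst = torch.randn(64, embed_dim)
    target = torch.rand(64)  # observed connection capacities (dummy values)

    with torch.no_grad():
        teacher_pred = teacher(src, dst)
    student_pred = student(src, dst)

    loss = distillation_loss(student_pred, teacher_pred, target)
    loss.backward()
    print(f"distillation loss: {loss.item():.4f}")
```

In this generic formulation, the blend weight `alpha` controls how strongly the student imitates the teacher versus fitting the observed capacities; the paper's actual loss function and compression strategy are defined in the full text.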