Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach

Paper Title

Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach

Authors

Yining Wang, Mingzhe Chen, Tao Luo, Walid Saad, Dusit Niyato, H. Vincent Poor, Shuguang Cui

Abstract

In this paper, a semantic communication framework is proposed for textual data transmission. In the studied model, a base station (BS) extracts the semantic information from textual data and transmits it to each user. The semantic information is modeled by a knowledge graph (KG) that consists of a set of semantic triples. After receiving the semantic information, each user recovers the original text using a graph-to-text generation model. To measure the performance of the considered framework, a metric of semantic similarity (MSS) is proposed that jointly captures the semantic accuracy and completeness of the recovered text. Due to wireless resource limitations, the BS may not be able to transmit the entire semantic information to each user while satisfying the transmission delay constraint. Hence, the BS must select an appropriate resource block for each user and determine which part of the semantic information to transmit. This is formulated as an optimization problem whose goal is to maximize the total MSS by jointly optimizing the resource allocation policy and the partial semantic information to be transmitted. To solve this problem, a proximal-policy-optimization-based reinforcement learning (RL) algorithm integrated with an attention network is proposed. The proposed algorithm uses the attention network to evaluate the importance of each triple in the semantic information, and then builds a relationship between the importance distribution of the triples and the total MSS. Compared to traditional RL algorithms, the proposed algorithm can dynamically adjust its learning rate, thereby ensuring convergence to a locally optimal solution.
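The idea of sending only part of the semantic information under a delay constraint can be illustrated with a minimal sketch. This is not the authors' algorithm: the importance scores are fixed inputs standing in for the output of the attention network, the PPO policy is replaced by a simple greedy rule, and the cost model (`bits_per_token`), the function names, and the example triples are all hypothetical.

```python
import math
from typing import List, Tuple

# A semantic triple: (head entity, relation, tail entity).
Triple = Tuple[str, str, str]


def softmax(scores: List[float]) -> List[float]:
    """Normalize raw scores into an importance distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def triple_cost_bits(triple: Triple, bits_per_token: int) -> int:
    """Hypothetical cost model: a fixed number of bits per whitespace token."""
    n_tokens = sum(len(part.split()) for part in triple)
    return n_tokens * bits_per_token


def select_triples(
    triples: List[Triple],
    raw_scores: List[float],
    rate_bps: float,
    delay_s: float,
    bits_per_token: int = 32,
) -> List[Triple]:
    """Greedily keep the most important triples that fit in the bit budget.

    The budget is rate * delay, i.e. the bits deliverable within the
    delay constraint on the assigned resource block (an assumption).
    """
    importance = softmax(raw_scores)
    remaining = rate_bps * delay_s
    # Visit triples from most to least important.
    order = sorted(range(len(triples)), key=lambda i: importance[i], reverse=True)
    chosen = []
    for i in order:
        cost = triple_cost_bits(triples[i], bits_per_token)
        if cost <= remaining:
            chosen.append(i)
            remaining -= cost
    # Return the kept triples in their original order.
    return [triples[i] for i in sorted(chosen)]


# Hypothetical example data.
triples = [
    ("BS", "transmits", "semantic information"),
    ("user", "recovers", "text"),
    ("KG", "consists of", "triples"),
]
raw_scores = [2.0, 1.0, 0.5]
selected = select_triples(triples, raw_scores, rate_bps=120.0, delay_s=2.0)
```

In the framework described in the abstract, the selection and the resource block assignment are optimized jointly by a learned policy to maximize the total MSS; the greedy rule here only shows how an importance distribution and a delay budget interact.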
