Paper Title

An Analysis of Ensemble Sampling

Authors

Chao Qin, Zheng Wen, Xiuyuan Lu, Benjamin Van Roy

Abstract

Ensemble sampling serves as a practical approximation to Thompson sampling when maintaining an exact posterior distribution over model parameters is computationally intractable. In this paper, we establish a regret bound that ensures desirable behavior when ensemble sampling is applied to the linear bandit problem. This represents the first rigorous regret analysis of ensemble sampling and is made possible by leveraging information-theoretic concepts and novel analytic techniques that may prove useful beyond the scope of this paper.
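
To make the setting in the abstract concrete, here is a minimal sketch of ensemble sampling on a linear bandit. It is an illustration under stated assumptions, not the authors' code: it assumes a Gaussian prior N(0, prior_var * I), Gaussian reward noise, and a finite action set, and the function name `ensemble_sampling` and all of its parameters are hypothetical. Each ensemble member starts as a draw from the prior and is updated with its own independently perturbed copy of every observed reward; at each step one member is picked uniformly at random and the agent acts greedily with respect to it.

```python
import random

def ensemble_sampling(actions, theta_star, num_models=10, horizon=200,
                      prior_var=1.0, noise_var=1.0, seed=0):
    """Run ensemble sampling on a linear-Gaussian bandit; return cumulative regret."""
    rng = random.Random(seed)
    d = len(theta_star)

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    # Each ensemble member starts as an independent draw from the prior (assumption).
    models = [[rng.gauss(0.0, prior_var ** 0.5) for _ in range(d)]
              for _ in range(num_models)]
    # Covariance shared by all members; it depends only on the actions played.
    cov = [[prior_var if i == j else 0.0 for j in range(d)] for i in range(d)]

    best = max(dot(a, theta_star) for a in actions)
    regret = 0.0
    for _ in range(horizon):
        # Pick one member uniformly at random and act greedily with respect to it.
        theta = models[rng.randrange(num_models)]
        a = max(actions, key=lambda x: dot(x, theta))
        reward = dot(a, theta_star) + rng.gauss(0.0, noise_var ** 0.5)
        regret += best - dot(a, theta_star)

        # Rank-one (Kalman-style) update, equivalent to the Gaussian posterior update.
        cov_a = [dot(row, a) for row in cov]
        denom = noise_var + dot(a, cov_a)
        for m in range(num_models):
            # Each member sees the reward with its own independent perturbation.
            perturbed = reward + rng.gauss(0.0, noise_var ** 0.5)
            err = perturbed - dot(a, models[m])
            models[m] = [models[m][i] + cov_a[i] * err / denom for i in range(d)]
        cov = [[cov[i][j] - cov_a[i] * cov_a[j] / denom for j in range(d)]
               for i in range(d)]
    return regret
```

With exact Thompson sampling the agent would draw a fresh parameter from the true posterior at every step; here the finite ensemble stands in for that posterior, which is the approximation whose regret the paper analyzes. Calling `ensemble_sampling([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.2])` returns the cumulative regret of one simulated run.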
