基于近似理论的RKHS土匪方法

论文标题

基于近似理论的RKHS土匪方法

Approximation Theory Based Methods for RKHS Bandits

论文作者

Takemori, Sho, Sato, Masahiro

论文摘要

RKHS强盗问题（也称为内核多武器匪徒问题）是具有嘈杂反馈的非线性功能的在线优化问题。尽管问题已经进行了广泛的研究，但与研究精心的线性匪徒相比，某些问题的结果不令人满意。具体而言，对抗RKHS强盗问题没有一般算法。此外，现有算法的高计算复杂性阻碍了实际应用。我们通过考虑近似理论的新型融合和误指定的线性匪徒问题来解决这些问题。使用近似方法，我们提出了随机RKHS匪徒问题的有效算法和对对抗性RKHS匪徒问题的第一个常规算法。此外，我们从经验上表明，我们提出的一种方法与IGP-UCB具有可比的累积后悔，其运行时间较短。

The RKHS bandit problem (also called kernelized multi-armed bandit problem) is an online optimization problem of non-linear functions with noisy feedback. Although the problem has been extensively studied, there are unsatisfactory results for some problems compared to the well-studied linear bandit case. Specifically, there is no general algorithm for the adversarial RKHS bandit problem. In addition, high computational complexity of existing algorithms hinders practical application. We address these issues by considering a novel amalgamation of approximation theory and the misspecified linear bandit problem. Using an approximation method, we propose efficient algorithms for the stochastic RKHS bandit problem and the first general algorithm for the adversarial RKHS bandit problem. Furthermore, we empirically show that one of our proposed methods has comparable cumulative regret to IGP-UCB and its running time is much shorter.

下载PDF全文

下载文献需遵守相关版权规定

论文标题