Revisiting Rainbow: Promoting more Insightful and Inclusive Deep Reinforcement Learning Research

Paper title

Revisiting Rainbow: Promoting more Insightful and Inclusive Deep Reinforcement Learning Research

Authors

Johan S. Obando-Ceron, Pablo Samuel Castro

Abstract

Since the introduction of DQN, a vast majority of reinforcement learning research has focused on reinforcement learning with deep neural networks as function approximators. New methods are typically evaluated on a set of environments that have now become standard, such as Atari 2600 games. While these benchmarks help standardize evaluation, their computational cost has the unfortunate side effect of widening the gap between those with ample access to computational resources, and those without. In this work we argue that, despite the community's emphasis on large-scale environments, the traditional small-scale environments can still yield valuable scientific insights and can help reduce the barriers to entry for underprivileged communities. To substantiate our claims, we empirically revisit the paper which introduced the Rainbow algorithm [Hessel et al., 2018] and present some new insights into the algorithms used by Rainbow.
