Functional Optimization Reinforcement Learning for Real-Time Bidding

Paper Title

Functional Optimization Reinforcement Learning for Real-Time Bidding

Authors

Yining Lu, Changjie Lu, Naina Bandyopadhyay, Manoj Kumar, Gaurav Gupta

Abstract

Real-time bidding (RTB) is the new paradigm of programmatic advertising. An advertiser wants to make intelligent use of a Demand-Side Platform to improve the performance of its ad campaigns. Existing approaches struggle to provide a satisfactory solution for bidding optimization because bidding behavior is stochastic. In this paper, we propose a multi-agent reinforcement learning architecture for RTB with functional optimization. We design a four-agent bidding environment: three Lagrange-multiplier-based functional optimization agents and one baseline agent (without any attribute of functional optimization). First, numerous attributes are assigned to each agent, including a biased or unbiased win probability, a Lagrange multiplier, and a click-through rate. To evaluate the performance of the proposed RTB strategy, we report results on ten sequential simulated auction campaigns. The results show that agents with functional actions and rewards achieved the largest average winning rate and winning surplus, given biased and unbiased winning information respectively. The experimental evaluations show that our approach significantly improves the campaign's efficacy and profitability.
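
As a rough illustration of the setup the abstract describes (four agents, three using a Lagrange multiplier and one baseline, evaluated over ten sequential simulated campaigns), the following Python sketch runs repeated sealed-bid auctions and tracks each agent's win rate and surplus. The abstract does not specify the auction format, the bid rule, or the value distributions, so the second-price mechanism, the `value / (1 + lambda)` bid shading, the uniform noise, and every name and parameter value below are assumptions chosen for illustration. It is not the authors' architecture and contains no reinforcement learning.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    ctr: float        # assumed click-through-rate scale for this agent
    lagrange: float   # assumed Lagrange multiplier; 0.0 means no bid shading
    wins: int = 0
    surplus: float = 0.0

    def bid(self, value: float) -> float:
        # Hypothetical rule: shade the private value by 1 / (1 + lambda).
        return value / (1.0 + self.lagrange)


def run_campaign(agents, n_auctions, rng):
    """Run one simulated campaign of second-price auctions (an assumption)."""
    for _ in range(n_auctions):
        # Each agent draws a noisy private value proportional to its CTR.
        offers = []
        for agent in agents:
            value = agent.ctr * rng.uniform(0.5, 1.5)
            offers.append((agent.bid(value), value, agent))
        offers.sort(key=lambda offer: offer[0], reverse=True)
        _, winner_value, winner = offers[0]
        second_bid = offers[1][0]
        winner.wins += 1
        # The winner pays the second-highest bid; surplus is value minus price.
        winner.surplus += winner_value - second_bid


def simulate(n_campaigns=10, n_auctions=1000, seed=0):
    """Return {agent name: (win rate, total surplus)} over all campaigns."""
    rng = random.Random(seed)
    agents = [
        Agent("baseline", ctr=0.10, lagrange=0.0),   # no functional optimization
        Agent("lagrange_a", ctr=0.10, lagrange=0.1),
        Agent("lagrange_b", ctr=0.10, lagrange=0.3),
        Agent("lagrange_c", ctr=0.10, lagrange=0.5),
    ]
    for _ in range(n_campaigns):
        run_campaign(agents, n_auctions, rng)
    total = n_campaigns * n_auctions
    return {a.name: (a.wins / total, a.surplus) for a in agents}


if __name__ == "__main__":
    for name, (win_rate, surplus) in simulate().items():
        print(f"{name}: win rate {win_rate:.3f}, surplus {surplus:.3f}")
```

Calling `simulate()` returns a win rate and an accumulated surplus per agent, loosely mirroring the two metrics named in the abstract. The numbers it produces say nothing about the paper's reported results.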
