Multi-Objective Vehicle Rebalancing for Ridehailing System using a Reinforcement Learning Approach

Paper Title

Multi-Objective Vehicle Rebalancing for Ridehailing System using a Reinforcement Learning Approach

Authors

Yuntian Deng, Hao Chen, Shiping Shao, Jiacheng Tang, Jianzong Pi, Abhishek Gupta

Abstract

This paper considers the problem of designing a rebalancing algorithm for a large-scale ridehailing system with asymmetric demand. We pose the rebalancing problem within a semi-Markov decision problem (SMDP) framework with closed queues of vehicles serving stationary but asymmetric demand over a large city with multiple nodes (representing neighborhoods). We assume that passengers queue up at every node until they are matched with a vehicle. The goal of the SMDP is to minimize a convex combination of the passengers' waiting time and the total empty vehicle miles traveled. A closed-form expression for the rebalancing strategy appears difficult to obtain from the resulting SMDP. We therefore use a deep reinforcement learning algorithm to determine an approximately optimal solution to the SMDP. The trained policy is compared with other well-known rebalancing algorithms, which are designed to address other objectives for the ridehailing problem (such as minimizing the demand drop probability).
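The objective described in the abstract, a convex combination of passenger waiting time and empty vehicle miles traveled, can be illustrated with a minimal sketch. The function name `rebalancing_cost`, the weight `alpha`, and the sample values are assumptions made for illustration; the abstract does not name the weight or say how the two terms are scaled.

```python
def rebalancing_cost(waiting_time, empty_miles, alpha):
    """Convex combination of passenger waiting time and empty vehicle miles.

    alpha is a hypothetical weight in [0, 1]; the abstract does not name it.
    alpha = 1 penalizes only waiting time, alpha = 0 only empty miles.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * waiting_time + (1.0 - alpha) * empty_miles


# Illustrative values only, not results from the paper.
print(rebalancing_cost(waiting_time=10.0, empty_miles=4.0, alpha=0.5))  # prints 7.0
```

Because the two terms carry different units, an actual formulation would presumably normalize or scale them before combining; that detail is not given in the abstract.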
