随机资源消耗的在线资源分配

论文标题

随机资源消耗的在线资源分配

Online Resource Allocation with Stochastic Resource Consumption

论文作者

Jiang, Jiashuo, Zhang, Jiawei

论文摘要

我们考虑一个在线资源分配问题，其中多个具有单个初始容量的资源可用于在多个离散时间段中顺序到达的随机请求。在每个时间段内，一个请求到达，其相关的奖励和大小独立于可能依赖资源依赖的已知分布。到达并揭示自己后，必须就是否接受请求做出在线决定。如果被接受，还应做出另一个在线决定以指定分配用于请求的资源，那么一定数量的资源将消耗等于请求的大小，并将收集奖励。决策者的目的是最大化受能力限制的总收集奖励。我们在两个设置下分别为我们的问题制定了近乎最佳的政策。当奖励分布有有限的支持时，我们假设鉴于每个奖励实现，不同资源的大小分布完全是正相关的，我们提出了一种自适应阈值策略。我们表明，在尺寸分布的某些规律性条件下，我们的政策享有最佳$ O（\ log t）$遗憾，其中$ t $表示总时间段。当对奖励分布的支持不一定是有限的时，我们会制定另一种自适应阈值策略，并使用$ O（\ log t）$遗憾的是，当奖励和每个请求的尺寸均与资源无关，而大小分布都满足相同的条件时。

We consider an online resource allocation problem where multiple resources, each with an individual initial capacity, are available to serve random requests arriving sequentially over multiple discrete time periods. At each time period, one request arrives and its associated reward and size are drawn independently from a known distribution that could be resource-dependent. Upon its arrival and revealing itself, an online decision has to be made on whether or not to accept the request. If accepted, another online decision should also be made to specify on assigning which resource to serve the request, then a certain amount of the resource equal to the size of the request will be consumed and a reward will be collected. The objective of the decision maker is to maximize the total collected reward subject to the capacity constraints. We develop near-optimal policies for our problem under two settings separately. When the reward distribution has a finite support, we assume that given each reward realization, the size distributions for different resources are perfectly positive correlated and we propose an adaptive threshold policy. We show that with certain regularity conditions on the size distribution, our policy enjoys an optimal $O(\log T)$ regret bound, where $T$ denotes the total time periods. When the support of the reward distribution is not necessarily finite, we develop another adaptive threshold policy with a $O(\log T)$ regret bound when both the reward and the size of each request are resource-independent and the size distribution satisfies the same conditions.

下载PDF全文

下载文献需遵守相关版权规定

论文标题