论文标题
在云中解决批处理随机垃圾箱包装问题:一种偶然约束的优化方法
Solving the Batch Stochastic Bin Packing Problem in Cloud: A Chance-constrained Optimization Approach
论文作者
论文摘要
本文研究了第一个派对云中的关键资源分配问题:将容器调度到机器。有数十个服务,每种服务都运行一组具有动态资源使用的同质容器;服务的容器每天以批处理方式安排。这个问题可以自然地表达为随机垃圾箱包装问题(SBPP)。但是,传统的SBPP研究通常集中在空心机器的案例上,这些机器的目的(即,以最大程度地减少了用过的机器的数量),对于非空置机器而言,对更常见的现实没有明确的定义。本文旨在缩小这一差距。首先,我们定义了一种新的目标指标,使用的置信度(UCAC),该指标以可能性衡量最大使用的资源,并被证明对空的和非空的机器都一致,并在偶然的限制下重新制定了SBPP。其次,通过在生成方法中对容器资源使用分布进行建模,我们揭示了UCAC可以用高斯近似,这可以通过现实世界应用程序的跟踪数据进行验证。第三,我们通过解决同等的切割库存变种以及两个基于启发式的求解器(UCAC最佳拟合,双层启发式方法)提出了一个精确的求解器。我们在合成数据集和实际应用轨迹上对这些求解器进行了实验性评估,这证明了我们的方法论比传统的SBPP最佳求解器的优势最小化了使用的机器数量,但资源违规率较低。
This paper investigates a critical resource allocation problem in the first party cloud: scheduling containers to machines. There are tens of services and each service runs a set of homogeneous containers with dynamic resource usage; containers of a service are scheduled daily in a batch fashion. This problem can be naturally formulated as Stochastic Bin Packing Problem (SBPP). However, traditional SBPP research often focuses on cases of empty machines, whose objective, i.e., to minimize the number of used machines, is not well-defined for the more common reality with nonempty machines. This paper aims to close this gap. First, we define a new objective metric, Used Capacity at Confidence (UCaC), which measures the maximum used resources at a probability and is proved to be consistent for both empty and nonempty machines, and reformulate the SBPP under chance constraints. Second, by modeling the container resource usage distribution in a generative approach, we reveal that UCaC can be approximated with Gaussian, which is verified by trace data of real-world applications. Third, we propose an exact solver by solving the equivalent cutting stock variant as well as two heuristics-based solvers -- UCaC best fit, bi-level heuristics. We experimentally evaluate these solvers on both synthetic datasets and real application traces, demonstrating our methodology's advantage over traditional SBPP optimal solver minimizing the number of used machines, with a low rate of resource violations.