Paper Title

Attention, Filling in The Gaps for Generalization in Routing Problems

Authors

Ahmad Bdeir, Jonas K. Falkner, Lars Schmidt-Thieme

Abstract

Machine Learning (ML) methods have become a useful tool for tackling vehicle routing problems, either in combination with popular heuristics or as standalone models. However, current methods suffer from poor generalization when tackling problems of different sizes or different distributions. As a result, ML in vehicle routing has witnessed an expansion phase, with new methodologies being created for particular problem instances that become infeasible at larger problem sizes. This paper aims to encourage the consolidation of the field through understanding and improving currently existing models, namely the attention model by Kool et al. We identify two discrepancy categories for VRP generalization. The first is based on differences that are inherent to the problems themselves, and the second relates to architectural weaknesses that limit the model's ability to generalize. Our contribution is threefold: We first target model discrepancies by adapting the Kool et al. method and its loss function for Sparse Dynamic Attention based on the alpha-entmax activation. We then target inherent differences through the use of a mixed instance training method that has been shown to outperform single instance training in certain scenarios. Finally, we introduce a framework for inference-level data augmentation that improves performance by leveraging the model's lack of invariance to rotation and dilation changes.
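The inference-level augmentation described in the last contribution can be sketched as follows: because the model is not invariant to rotation and dilation of the node coordinates, running it on several transformed copies of an instance and keeping the best tour can improve solution quality. The function below is a minimal NumPy sketch of that idea; the function name, the rotation count, and the dilation factors are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def augment_instance(coords, n_rotations=8, dilations=(0.8, 1.0, 1.2)):
    """Generate rotated and dilated copies of a 2-D routing instance.

    coords: (n, 2) array of node coordinates.
    Returns a list of (n, 2) arrays. At inference time, the model would
    be run on every copy and the cheapest resulting tour kept.
    """
    center = coords.mean(axis=0)
    copies = []
    for k in range(n_rotations):
        theta = 2.0 * np.pi * k / n_rotations
        rot = np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta),  np.cos(theta)]])
        # Rotate around the instance centroid.
        rotated = (coords - center) @ rot.T + center
        for s in dilations:
            # Dilate (scale) around the same centroid.
            copies.append((rotated - center) * s + center)
    return copies
```

Since a tour's node order is unchanged by these rigid transformations, the best tour found on any copy maps directly back to the original instance.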
