Query Processing on Tensor Computation Runtimes

Paper title

Query Processing on Tensor Computation Runtimes

Authors

He, Dong; Nakandala, Supun; Banda, Dalitso; Sen, Rathijit; Saur, Karla; Park, Kwanghyun; Curino, Carlo; Camacho-Rodríguez, Jesús; Karanasos, Konstantinos; Interlandi, Matteo

Abstract

The huge demand for computation in artificial intelligence (AI) is driving unparalleled investments in hardware and software systems for AI. This leads to an explosion in the number of specialized hardware devices, which are now offered by major cloud vendors. By hiding the low-level complexity through a tensor-based interface, tensor computation runtimes (TCRs) such as PyTorch allow data scientists to efficiently exploit the exciting capabilities offered by the new hardware. In this paper, we explore how database management systems can ride the wave of innovation happening in the AI space. We design, build, and evaluate Tensor Query Processor (TQP): TQP transforms SQL queries into tensor programs and executes them on TCRs. TQP is able to run the full TPC-H benchmark by implementing novel algorithms for relational operators on the tensor routines. At the same time, TQP can support various hardware while only requiring a fraction of the usual development effort. Experiments show that TQP can improve query execution time by up to 10$\times$ over specialized CPU- and GPU-only systems. Finally, TQP can accelerate queries mixing ML predictions and SQL end-to-end, and deliver up to 9$\times$ speedup over CPU baselines.
