Fine-Grained Tuple Transfer for Pipelined Query Execution on CPU-GPU Coprocessor

Yang, Zhenhua; Pan, Qingfeng; Xu, Chen

doi:10.1007/978-3-031-30637-2_2

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13943)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

To leverage the massively parallel capability of GPUs for query execution, GPU databases have been studied for over a decade. Recently, researchers have proposed executing queries on both the CPU and the GPU in a pipelined fashion. In pipelined query execution, cross-processor tuple transfer plays a crucial role in overall query performance. The state-of-the-art solution implements cross-processor tuple transfer with a queue-like data structure. However, it is coarse-grained because it relies on a single spin lock for thread safety. This design hurts performance because it prevents threads from accessing the queue simultaneously. In this paper, we propose a fine-grained tuple transfer mechanism. It employs decoupled enqueue/dequeue to enable two threads on different processors to access the queue at the same time. Moreover, it uses subqueue-based locking to enable threads on the same processor to access the queue at the same time. We implement a prototype system, namely πQC, which adopts fine-grained tuple transfer. Our experiments show that πQC achieves an order of magnitude better performance than existing GPU databases such as HeavyDB.
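
The two ideas named in the abstract can be illustrated with a minimal CPU-only C++17 sketch. It is not the πQC implementation, which this preview does not show. It assumes a classic two-lock linked queue as a stand-in for decoupled enqueue/dequeue, and a fixed array of such queues as a stand-in for subqueue-based locking. The names `TwoLockQueue`, `SubqueueTransfer`, and `transfer_sum` are hypothetical, and `std::mutex` replaces the spin lock for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <optional>
#include <thread>
#include <utility>
#include <vector>

// Decoupled enqueue/dequeue (assumed design): the tail and the head are guarded
// by different locks, so one producer and one consumer can proceed concurrently.
template <typename T>
class TwoLockQueue {
    struct Node {
        std::optional<T> value;
        std::atomic<Node*> next{nullptr};
    };
    Node* head_;  // dummy node; guarded by head_mutex_
    Node* tail_;  // last node; guarded by tail_mutex_
    std::mutex head_mutex_, tail_mutex_;

public:
    TwoLockQueue() : head_(new Node), tail_(head_) {}
    ~TwoLockQueue() {
        while (head_) { Node* n = head_->next.load(); delete head_; head_ = n; }
    }
    TwoLockQueue(const TwoLockQueue&) = delete;
    TwoLockQueue& operator=(const TwoLockQueue&) = delete;

    void enqueue(T v) {
        Node* n = new Node;
        n->value = std::move(v);
        std::lock_guard<std::mutex> guard(tail_mutex_);
        tail_->next.store(n, std::memory_order_release);
        tail_ = n;
    }

    std::optional<T> dequeue() {
        std::lock_guard<std::mutex> guard(head_mutex_);
        Node* first = head_->next.load(std::memory_order_acquire);
        if (!first) return std::nullopt;  // queue is empty
        std::optional<T> out = std::move(first->value);
        delete head_;
        head_ = first;  // the dequeued node becomes the new dummy
        return out;
    }
};

// Subqueue-based locking (assumed design): threads on the same side are spread
// over several independently locked subqueues instead of sharing one lock.
template <typename T>
class SubqueueTransfer {
    std::vector<TwoLockQueue<T>> subqueues_;

public:
    explicit SubqueueTransfer(std::size_t n) : subqueues_(n) { assert(n > 0); }

    void enqueue(std::size_t producer_id, T v) {
        subqueues_[producer_id % subqueues_.size()].enqueue(std::move(v));
    }

    // Start at the consumer's own subqueue, then scan the others.
    std::optional<T> dequeue(std::size_t consumer_id) {
        const std::size_t n = subqueues_.size();
        for (std::size_t i = 0; i < n; ++i)
            if (auto v = subqueues_[(consumer_id + i) % n].dequeue()) return v;
        return std::nullopt;
    }
};

// Demo: each producer enqueues 1..per_producer; consumers sum everything they drain.
long long transfer_sum(std::size_t producers, std::size_t consumers, int per_producer) {
    SubqueueTransfer<int> q(producers);
    std::atomic<long long> sum{0};
    std::atomic<long long> remaining{static_cast<long long>(producers) * per_producer};
    std::vector<std::thread> threads;
    for (std::size_t p = 0; p < producers; ++p)
        threads.emplace_back([&q, p, per_producer] {
            for (int i = 1; i <= per_producer; ++i) q.enqueue(p, i);
        });
    for (std::size_t c = 0; c < consumers; ++c)
        threads.emplace_back([&q, &sum, &remaining, c] {
            while (remaining.load() > 0) {
                if (auto v = q.dequeue(c)) { sum += *v; --remaining; }
                else std::this_thread::yield();
            }
        });
    for (auto& t : threads) t.join();
    return sum.load();
}
```

In this sketch, a producer and a consumer contend only when they touch the same subqueue, and even then they take different locks. A single-lock queue would instead serialize every access, which is the coarse-grained behavior the abstract criticizes. Transfers between CPU and GPU memory are out of scope here.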

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 62272168), and Guangxi Key Laboratory of Trusted Software.

Author information

Authors and Affiliations

Shanghai Engineering Research Center of Big Data Management, East China Normal University, Shanghai, 200062, China
Zhenhua Yang, Qingfeng Pan & Chen Xu
Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, 541004, China
Chen Xu

Corresponding author

Correspondence to Chen Xu.

Editor information

Editors and Affiliations

Tianjin University, Tianjin, China
Xin Wang
University of Torino, Turin, Italy
Maria Luisa Sapino
POSTECH, Pohang, Korea (Republic of)
Wook-Shin Han
University of California Santa Barbara, Santa Barbara, CA, USA
Amr El Abbadi
University of Auckland, Auckland, New Zealand
Gill Dobbie
Tianjin University, Tianjin, China
Zhiyong Feng
Beijing University of Posts and Telecommunications, Beijing, China
Yingxiao Shao
The University of Queensland, Brisbane, QLD, Australia
Hongzhi Yin

About this paper

Cite this paper

Yang, Z., Pan, Q., Xu, C. (2023). Fine-Grained Tuple Transfer for Pipelined Query Execution on CPU-GPU Coprocessor. In: Wang, X., et al. Database Systems for Advanced Applications. DASFAA 2023. Lecture Notes in Computer Science, vol 13943. Springer, Cham. https://doi.org/10.1007/978-3-031-30637-2_2

DOI: https://doi.org/10.1007/978-3-031-30637-2_2
Published: 14 April 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30636-5
Online ISBN: 978-3-031-30637-2
eBook Packages: Computer Science, Computer Science (R0)
