Abstract
Data Warehouses (DWs) store valuable information not only for strategic business decisions but also for daily operational decisions. As a consequence, a large number of queries are submitted concurrently, stressing the database engine's ability to handle such workloads without severely degrading query response times. The query-at-a-time model of common database engines, in which each query is executed independently and competes for the same resources, is inefficient for large DWs and does not provide the expected performance and scalability when processing large numbers of concurrent queries. Related work shows that there is a performance advantage in sharing data and processing, but the proposed solutions suffer from memory limitations, reduced scalability and unpredictable execution times when applied to large DWs, particularly those with large dimensions. SPIN proposes an approach to share computation and data among concurrent queries that delivers scale-up, even in the presence of massive query workloads. In this paper we describe the mechanisms used by SPIN to embed data and queries into a shared query processing pipeline tree and how SPIN dynamically reorganizes the processing tree. We also provide experimental validation of the approach.
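The contrast the abstract draws between the query-at-a-time model and shared processing can be illustrated with a minimal sketch. This is not SPIN's implementation: the `Query` class, the `shared_scan` and `query_at_a_time` functions, and the `SALES` rows are hypothetical names and data introduced only for illustration, and the sketch covers a single shared scan with simple filters and sums rather than the pipeline tree described in the chapter.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical fact rows, invented for this illustration.
SALES: List[Row] = [
    {"region": "EU", "year": 2013, "amount": 10.0},
    {"region": "US", "year": 2013, "amount": 20.0},
    {"region": "EU", "year": 2014, "amount": 5.0},
]


@dataclass
class Query:
    """A toy aggregation query: sum `column` over rows matching `predicate`."""
    name: str
    predicate: Callable[[Row], bool]
    column: str
    total: float = 0.0


def make_queries() -> List[Query]:
    """Build a fresh set of concurrent queries with zeroed accumulators."""
    return [
        Query("eu_total", lambda r: r["region"] == "EU", "amount"),
        Query("total_2013", lambda r: r["year"] == 2013, "amount"),
    ]


def query_at_a_time(rows: List[Row], queries: List[Query]) -> int:
    """Each query scans the data independently; returns the number of row reads."""
    rows_read = 0
    for q in queries:
        for row in rows:
            rows_read += 1
            if q.predicate(row):
                q.total += row[q.column]
    return rows_read


def shared_scan(rows: List[Row], queries: List[Query]) -> int:
    """Each row is read once and offered to every query; returns the number of row reads."""
    rows_read = 0
    for row in rows:
        rows_read += 1
        for q in queries:
            if q.predicate(row):
                q.total += row[q.column]
    return rows_read


if __name__ == "__main__":
    independent = make_queries()
    shared = make_queries()
    print("query-at-a-time row reads:", query_at_a_time(SALES, independent))
    print("shared-scan row reads:", shared_scan(SALES, shared))
    for q in shared:
        print(q.name, q.total)
```

In this toy setting both strategies produce the same totals, but the number of row reads grows with the number of queries only in the query-at-a-time variant; the shared variant still pays per-query predicate evaluation on every row.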
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Costa, J.P., Furtado, P. (2015). Data Warehouse Processing Scale-Up for Massive Concurrent Queries with SPIN. In: Hameurlain, A., Küng, J., Wagner, R., Bellatreche, L., Mohania, M. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XVII. Lecture Notes in Computer Science, vol 8970. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46335-2_1
DOI: https://doi.org/10.1007/978-3-662-46335-2_1
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46334-5
Online ISBN: 978-3-662-46335-2
eBook Packages: Computer Science, Computer Science (R0)