Abstract
For efficient query processing, a relational table should be indexed in multiple ways; for efficient database loading, indexes should be omitted. Moerkotte’s “small materialized aggregates” can be used to alleviate this tension, notably in the form of Netezza’s “zone maps.” Their most significant advantageous characteristics are that (i) load bandwidth is maximized by avoiding the cost of index maintenance, (ii) there is no need for complex index tuning, and (iii) scans for typical queries are very fast. Their most significant limiting characteristics are that (iv) they are effective only for query predicates on columns correlated with the load sequence, (v) individual outlier values can sharply reduce their effectiveness, and (vi) they fail to improve search performance within a zone.
In this research, we introduce zone filters and zone indexes that address these limitations without reducing the advantages. The new data structures can be created as side effects of the load process, with all required analyses accomplished while a moderate amount of new data still remains in the buffer pool. Traditional sorting and indexing are not required. Nonetheless, query performance matches that of zxone maps where those apply, exceeds it for predicates for which zone maps are ineffective, and can be comparable to query processing in a database with traditional indexing, as demonstrated in our simulations.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
[ADH 01] Ailamaki, A., DeWitt, D.J., Hill, M.D., Skounakis, M.: Weaving relations for cache performance. In: VLDB 2001, pp. 169–180 (2001)
[BB 02] Bumbulis, P., Bowman, I.: A compact B-tree. In: SIGMOD 2002, pp. 533–541 (2002)
[BM 70] Bayer, R., McCreight, E.M.: Organization and maintenance of large ordered indexes. In: SIGFIDET Workshop 1970, pp. 107–141 (1970)
[G 03] Graefe, G.: Sorting and indexing with partitioned B-trees. In: CIDR 2003(2003)
[G 06] Graefe, G.: B-tree indexes, interpolation search, and skew. In: DaMoN 2006, p. 5 (2006)
[GKK 01] Gärtner, A., Kemper, A., Kossmann, D., Zeller, B.: Efficient bulk deletes in relational databases. In: ICDE 2001, pp. 183–192 (2001)
[GL 01] Graefe, G., Larson, P.-Å.: B-Tree indexes and CPU caches. In: ICDE 2001, pp. 349–358 (2001)
[L 01] Lomet, D.B.: The evolution of effective B-tree page organization and techniques: A personal account. SIGMOD Record 30(3), 64–69 (2001)
[LC 86] Lehman, T.J., Carey, M.J.: A study of index structures for main memory database management systems. In: VLDB 1986, pp. 294–303 (1986)
[LJB 95] Leslie, H., Jain, R., Birdsall, D., Yaghmai, H.: Efficient search of multi-dimensional B-trees. In: VLDB 1995, pp. 710–719 (1995)
[M 98] Moerkotte, G.: Small materialized aggregates: A light weight index structure for data warehousing. In: VLDB 1998, pp. 476–487 (1998)
[RR 00] Rao, J., Ross, K.A.: Making B+-trees cache conscious in main memory. In: SIGMOD 2000, pp. 475–486 (2000)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Graefe, G. (2009). Fast Loads and Fast Queries. In: Pedersen, T.B., Mohania, M.K., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2009. Lecture Notes in Computer Science, vol 5691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03730-6_10
Download citation
DOI: https://doi.org/10.1007/978-3-642-03730-6_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03729-0
Online ISBN: 978-3-642-03730-6
eBook Packages: Computer ScienceComputer Science (R0)