Abstract
Declustering techniques reduce query response times through parallel I/O by distributing data among multiple devices. Except for a few cases it is not possible to find declustering schemes that are optimal for all spatial range queries. As a result of this, most of the research on declustering have focused on finding schemes with low worst case additive error. Recently, constrained declustering that maximizes the threshold k such that all spatial range queries ≤ k buckets are optimal is proposed. In this paper, we extend constrained declustering to high dimensions. We investigate high dimensional bound diagrams that are used to provide upper bound on threshold and propose a method to find good threshold-based declustering schemes in high dimensions. We show that using replicated declustering with threshold N, low worst case additive error can be achieved for many values of N. In addition, we propose a framework to find thresholds in replicated declustering.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Abdel-Ghaffar, K.A.S., El Abbadi, A.: Optimal allocation of two-dimensional data. In: Afrati, F.N., Kolaitis, P.G. (eds.) ICDT 1997. LNCS, vol. 1186, pp. 409–418. Springer, Heidelberg (1996)
Atallah, M.J., Prabhakar, S.: (Almost) optimal parallel block access for range queries. In: Proc. ACM PODS, Dallas, Texas, May 2000, pp. 205–215 (2000)
Beckmann, N., Kriegel, H., Schneider, R., Seeger, B.: The R* tree: An efficient and robust access method for points and rectangles. In: Proc. ACM SIGMOD, May 23-25, pp. 322–331 (1990)
Berchtold, S., Bohm, C., Braunmuller, B., Keim, D.A., Kriegel, H.-P.: Fast parallel similarity search in multimedia databases. In: Proc. ACM SIGMOD, Arizona, U.S.A., pp. 1–12 (1997)
Bhatia, R., Sinha, R.K., Chen, C.-M.: Hierarchical declustering schemes for range queries. In: Zaniolo, C., Grust, T., Scholl, M.H., Lockemann, P.C. (eds.) EDBT 2000. LNCS, vol. 1777, pp. 525–537. Springer, Heidelberg (2000)
Chen, C.-M., Bhatia, R., Sinha, R.: Declustering using golden ratio sequences. In: ICDE, San Diego, California, February 2000, pp. 271–280 (2000)
Chen, C.-M., Cheng, C.T.: From discrepancy to declustering: Near optimal multidimensional declustering strategies for range queries. In: Proc. ACM PODS, Wisconsin, Madison, pp. 29–38 (2002)
Chen, C.-M., Cheng, C.: Replication and retrieval strategies of multidimensional data on parallel disks. In: CIKM (October 2003)
Du, H.C., Sobolewski, J.S.: Disk allocation for cartesian product files on multiple-disk systems. ACM Trans. on Database Systems 7(1), 82–101 (1982)
Faloutsos, C., Metaxas, D.: Declustering using error correcting codes. In: Proc. ACM PODS, pp. 253–258 (1989)
Ferhatosmanoglu, H., Agrawal, D., El Abbadi, A.: Concentric hyperspaces and disk allocation for fast parallel range searching. In: Proc. ICDE, Sydney, Australia, March 1999, pp. 608–615 (1999)
Ferhatosmanoglu, H., Tosun, A.Ş., Ramachandran, A.: Replicated declustering of spatial data. In: 23rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (June 2004)
Gaede, V., Gunther, O.: Multidimensional access methods. ACM Computing Surveys 30, 170–231 (1998)
Ghandeharizadeh, S., DeWitt, D.J.: Hybrid-range partitioning strategy: A new declustering strategy for multiprocessor database machines. In: VLDB, August 1990, pp. 481–492 (1990)
Ghandeharizadeh, S., DeWitt, D.J.: A performance analysis of alternative multi-attribute declustering strategies. In: Proc. ACM SIGMOD, pp. 29–38 (1992)
Gray, J., Horst, B., Walker, M.: Parity striping of disc arrays: Low-cost reliable storage with acceptable throughput. In: Proc. VLDB, Washington DC, August 1990, pp. 148–161 (1990)
Guttman, A.: R-trees: A dynamic index structure for spatial searching. In: Proc. ACM SIGMOD, pp. 47–57 (1984)
Hua, K.A., Young, H.C.: A general multidimensional data allocation method for multicomputer database systems. In: Tjoa, A.M. (ed.) DEXA 1997. LNCS, vol. 1308, pp. 401–409. Springer, Heidelberg (1997)
Kim, M.H., Pramanik, S.: Optimal file distribution for partial match retrieval. In: Proc. ACM SIGMOD, Chicago, pp. 173–182 (1988)
Prabhakar, S., Abdel-Ghaffar, K., Agrawal, D., El Abbadi, A.: Cyclic allocation of two-dimensional data. In: ICDE, Orlando, Florida, pp. 94–101 (1998)
Samet, H.: The Design and Analysis of Spatial Structures. Addison Wesley, Massachusetts (1989)
Tosun, A.S., Ferhatosmanoglu, H.: Optimal parallel I/O using replication. In: Proceedings of International Workshops on Parallel Processing (ICPP), Vancouver, Canada (August 2002)
Tosun, A.Ş.: Replicated declustering for arbitrary queries. In: 19th ACM Symposium on Applied Computing (March 2004)
Tosun, A.Ş.: Constrained declustering. In: International Conference on Information Technology Coding and Computing (April 2005)
Tosun, A.Ş.: Design theoretic approach to replicated declustering. In: International Conference on Information Technology Coding and Computing (April 2005)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tosun, A.Ş. (2005). Threshold Based Declustering in High Dimensions. In: Andersen, K.V., Debenham, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2005. Lecture Notes in Computer Science, vol 3588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11546924_80
Download citation
DOI: https://doi.org/10.1007/11546924_80
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28566-3
Online ISBN: 978-3-540-31729-6
eBook Packages: Computer ScienceComputer Science (R0)