Abstract
A k-nearest neighbor (k-NN) query, which retrieves the k nearest points from a database, is one of the fundamental query types in spatial databases. An all k-nearest neighbor (AkNN) query, a variation of the k-NN query, determines the k nearest neighbors of every point in the dataset in a single query process. In this paper, we propose a method for processing AkNN queries in Hadoop. We decompose the given space into cells and execute the query in a distributed and parallel manner using the MapReduce framework. By using the distribution statistics of the target data points, our method processes such queries efficiently.
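The abstract does not spell out the algorithm, so the following is only a rough illustration of the cell-decomposition idea, not the authors' method. It is a minimal single-process Python sketch that imitates a map, shuffle, and reduce round under several assumptions: a uniform grid with a hypothetical `cell_size` parameter, distinct 2-D points given as tuples, and no use of the distribution statistics mentioned above. All function names are made up for this example. Each reducer computes k-NN candidates inside one cell and marks a point's result as final only when its k-th candidate is no farther than the nearest cell edge; points that are not final would still need to be checked against neighboring cells in a later round.

```python
import math
from collections import defaultdict


def cell_of(p, cell_size):
    """Map-phase key: the (column, row) of the grid cell containing p."""
    return (int(p[0] // cell_size), int(p[1] // cell_size))


def boundary_gap(p, cell, cell_size):
    """Distance from p to the nearest edge of its own cell."""
    x0, y0 = cell[0] * cell_size, cell[1] * cell_size
    return min(p[0] - x0, x0 + cell_size - p[0],
               p[1] - y0, y0 + cell_size - p[1])


def map_phase(points, cell_size):
    """Emit (cell, point) pairs, as a mapper would."""
    for p in points:
        yield cell_of(p, cell_size), p


def shuffle(pairs):
    """Group values by key, imitating the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(cell, members, k, cell_size):
    """Compute k-NN candidates for every point using only its own cell."""
    for p in members:
        # Assumes distinct points: q != p skips only the point itself.
        nearest = sorted((math.dist(p, q), q) for q in members if q != p)[:k]
        # Any point outside the cell is at least boundary_gap away, so the
        # local answer is a valid k-NN when the k-th distance fits inside it.
        final = (len(nearest) == k
                 and nearest[-1][0] <= boundary_gap(p, cell, cell_size))
        yield p, [q for _, q in nearest], final


def aknn_local(points, k, cell_size):
    """Run one simulated round; returns {point: (candidates, is_final)}."""
    groups = shuffle(map_phase(points, cell_size))
    return {p: (nbrs, final)
            for cell, members in groups.items()
            for p, nbrs, final in reduce_phase(cell, members, k, cell_size)}
```

Calling `aknn_local(points, k=2, cell_size=10.0)` returns, for each point, its candidate neighbors and a flag telling whether the candidates are already guaranteed to be correct. How many points end up non-final depends heavily on the cell size relative to the data density, which is one reason a decomposition informed by the data distribution can help.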
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yokoyama, T., Ishikawa, Y., Suzuki, Y. (2012). Processing All k-Nearest Neighbor Queries in Hadoop. In: Gao, H., Lim, L., Wang, W., Li, C., Chen, L. (eds) Web-Age Information Management. WAIM 2012. Lecture Notes in Computer Science, vol 7418. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32281-5_34
DOI: https://doi.org/10.1007/978-3-642-32281-5_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32280-8
Online ISBN: 978-3-642-32281-5
eBook Packages: Computer Science, Computer Science (R0)