Abstract
Geospatial location matching plays a significant role in spatial databases. In this paper, we propose and study a novel parallel spatio-textual location matching (STLM) query. Given two sets P and Q of spatial locations with textual attributes and a spatio-textual matching threshold \(\theta\), the STLM query finds all location pairs whose spatio-textual similarity exceeds \(\theta\). We believe that the STLM query is useful in many applications, such as important-location and hot-region detection, duplicate spatio-textual data cleaning, and location-based services in general. The STLM query poses three challenges: (1) how to evaluate the spatio-textual similarity between two locations in a practical way, (2) how to prune the search space effectively in both the spatial and textual domains, and (3) how to process the query in parallel, given its high computational complexity. To address these challenges, we develop a novel direct matching (DM) search algorithm. A linear combination is adopted to merge spatial proximity and textual similarity into a single similarity score. To further improve query efficiency, we develop a grid-based expansion scheduling scheme built on a purpose-built grid index structure. We conduct extensive experiments on real and synthetic spatio-textual data sets to verify the performance of the developed algorithms.
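To make the query definition concrete, the following is a minimal sketch of threshold matching with a linearly combined score. It is not the paper's DM algorithm or its expansion scheduling scheme. The abstract does not give the exact formulas, so several choices here are assumptions: each matching pair takes one location from P and one from Q, textual similarity is the Jaccard coefficient over keyword sets, spatial proximity is 1 - d / d_max with Euclidean distance d, and the weight `lam` (with 0 < lam <= 1), the normalizer `d_max`, and all function names are hypothetical. The grid variant uses a simple bound: since textual similarity is at most 1, a pair can exceed \(\theta\) only if d < d_max (1 - \(\theta\)) / lam, provided \(\theta\) >= 1 - lam.

```python
from collections import defaultdict
from math import floor, hypot


def jaccard(a, b):
    """Textual similarity of two keyword sets (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def st_similarity(p, q, lam, d_max):
    """Linear combination of spatial proximity and textual similarity.

    p and q are (x, y, keyword_set) tuples; lam and d_max are assumed parameters.
    """
    dist = hypot(p[0] - q[0], p[1] - q[1])
    spatial = max(0.0, 1.0 - dist / d_max)  # clamp proximity to [0, 1]
    return lam * spatial + (1.0 - lam) * jaccard(p[2], q[2])


def stlm_naive(P, Q, theta, lam, d_max):
    """Baseline: score every pair in P x Q and keep those above theta."""
    return [(i, j)
            for i, p in enumerate(P)
            for j, q in enumerate(Q)
            if st_similarity(p, q, lam, d_max) > theta]


def stlm_grid(P, Q, theta, lam, d_max):
    """Same result as stlm_naive, using a uniform grid to skip distant pairs."""
    if theta < 1.0 - lam:
        # Text alone can exceed theta, so distance cannot rule out any pair.
        return stlm_naive(P, Q, theta, lam, d_max)
    # Pairs at distance >= radius score at most theta, so they never match.
    radius = d_max * (1.0 - theta) / lam
    if radius <= 0.0:
        return []
    grid = defaultdict(list)
    for j, q in enumerate(Q):
        grid[(floor(q[0] / radius), floor(q[1] / radius))].append(j)
    result = []
    for i, p in enumerate(P):
        cx, cy = floor(p[0] / radius), floor(p[1] / radius)
        # With cell size equal to radius, candidates lie in the 3 x 3 block.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    if st_similarity(p, Q[j], lam, d_max) > theta:
                        result.append((i, j))
    return sorted(result)


P = [(0.0, 0.0, {"coffee", "wifi"}), (10.0, 10.0, {"museum"})]
Q = [(0.5, 0.0, {"coffee", "wifi"}), (9.0, 10.0, {"park"}),
     (50.0, 50.0, {"coffee", "wifi"})]
matches = stlm_grid(P, Q, theta=0.6, lam=0.5, d_max=10.0)
```

In this toy input only the pair (0, 0) is both close and textually identical enough to exceed the threshold. The grid version is one simple way to prune in the spatial domain; pruning in the textual domain and parallel processing, which the abstract also names as challenges, are not covered by this sketch.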







Acknowledgements
This study is supported by the Program of New Century Excellent Talents in Fujian Province University.
Cite this article
Wang, N., Zeng, J., Chen, M. et al. An efficient algorithm for spatio-textual location matching. Distrib Parallel Databases 38, 649–666 (2020). https://doi.org/10.1007/s10619-020-07289-9
DOI: https://doi.org/10.1007/s10619-020-07289-9