Abstract
Holistic schema matching is a fundamental challenge in the big data integration domain. Ideally, clusters of semantically corresponding elements are created and then updated as more schemas are matched. Developing a high-quality holistic schema matching approach is critical for two main reasons: first, it identifies as many accurate and holistic semantic correspondences as possible right from the beginning; second, it considerably reduces the search space. Nevertheless, this problem is challenging because overlapping schema elements are not available. Identifying schema overlaps is further complicated by two factors: (1) the number of schemas is large; and (2) overlaps vary from one schema to another. In this paper, we present HMO, a Holistic schema Matching approach based on schema Overlaps and designed for large-scale schemas. HMO balances the size of the search space against the quality of the holistic semantic correspondences. To narrow down the search space, HMO matches schemas based on their overlaps. To obtain high accuracy, HMO uses an existing high-quality semantic similarity measure. Experimental results on four real-world domains show the effectiveness and scalability of our matching approach.
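The idea of clusters that are created and then updated as more schemas are matched can be illustrated with a minimal sketch. It does not reproduce HMO: the abstract does not specify the algorithm, so the overlap-based pruning is omitted, and the names `match_holistically` and `token_jaccard`, the threshold value, and the token-based similarity are all assumptions made for illustration. In particular, `token_jaccard` is only a stand-in for the semantic similarity measure the paper relies on.

```python
from typing import Callable, Dict, List, Tuple

Element = Tuple[str, str]  # (schema name, element name)


def token_jaccard(a: str, b: str) -> float:
    """Hypothetical stand-in similarity: Jaccard overlap of name tokens."""
    tokens_a = set(a.lower().split("_"))
    tokens_b = set(b.lower().split("_"))
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def match_holistically(
    schemas: Dict[str, List[str]],
    sim: Callable[[str, str], float] = token_jaccard,
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[List[Element]]:
    """Grow clusters of corresponding elements one schema at a time."""
    clusters: List[List[Element]] = []
    for schema_name, elements in schemas.items():
        for element in elements:
            best_cluster, best_score = None, threshold
            for cluster in clusters:
                # Score a cluster by its most similar member.
                score = max(sim(element, other) for _, other in cluster)
                if score >= best_score:
                    best_cluster, best_score = cluster, score
            if best_cluster is None:
                # No cluster is similar enough: start a new one.
                clusters.append([(schema_name, element)])
            else:
                best_cluster.append((schema_name, element))
    return clusters


example_schemas = {
    "S1": ["author_name", "title"],
    "S2": ["name_author", "book_title"],
    "S3": ["price"],
}
example_clusters = match_holistically(example_schemas)
```

Each new element is compared against every existing cluster here, which is exactly the exhaustive search that an overlap-based approach aims to avoid; restricting the comparison to the overlapping parts of the schemas would shrink that inner loop.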
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Yousfi, A., Yazidi, M.H.E., Zellou, A. (2020). Towards a Holistic Schema Matching Approach Designed for Large-Scale Schemas. In: Nguyen, N.T., Hoang, B.H., Huynh, C.P., Hwang, D., Trawiński, B., Vossen, G. (eds) Computational Collective Intelligence. ICCCI 2020. Lecture Notes in Computer Science, vol 12496. Springer, Cham. https://doi.org/10.1007/978-3-030-63007-2_1
DOI: https://doi.org/10.1007/978-3-030-63007-2_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63006-5
Online ISBN: 978-3-030-63007-2
eBook Packages: Computer Science, Computer Science (R0)