Abstract
Ontology matching is a crucial task for data integration and management on the Semantic Web. Current ontology matching techniques can, to some extent, solve many of the problems caused by the heterogeneity of ontologies. However, when matching large ontologies, most ontology matchers take too long to run and place heavy demands on the running environment. In this paper, we propose V-Doc+, a 3-stage approach for matching large ontologies that builds on the MapReduce framework and the virtual document technique, and that significantly reduces run time while keeping good precision and recall. Firstly, we establish four MapReduce processes to construct a virtual document for each entity (class, property or instance): a simple process for the descriptions of entities, an iterative process for the descriptions of blank nodes, and two processes for exchanging descriptions with neighbors. Then, we use a word-weight-based partition method to calculate similarities between entities in the corresponding reducers. We report results from two experiments, one on an OAEI dataset and one on a dataset from the biology domain. The performance of V-Doc+ is assessed by comparing it with existing ontology matchers. Additionally, we show how run time decreases as the size of the cluster increases.
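The similarity step described above can be illustrated with a minimal, single-machine sketch. It is not the V-Doc+ implementation: the abstract does not specify the partitioning scheme, so the sketch assumes one plausible reading in which each word serves as the partition key, each reducer accumulates partial dot products for entities that share that word, and the final score is cosine similarity. All function names, entity identifiers and weights are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def map_phase(docs):
    """Emit (word, (entity_id, weight)) so that each word acts as a partition key."""
    for entity_id, words in docs.items():
        for word, weight in words.items():
            yield word, (entity_id, weight)


def shuffle(pairs):
    """Group mapper output by key, as a MapReduce runtime would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, source_ids):
    """For each word, add w1 * w2 to the dot product of every cross-ontology pair."""
    dots = defaultdict(float)
    for postings in groups.values():
        for (e1, w1), (e2, w2) in combinations(postings, 2):
            # Only compare an entity of the source ontology with one of the target.
            if (e1 in source_ids) != (e2 in source_ids):
                pair = (e1, e2) if e1 in source_ids else (e2, e1)
                dots[pair] += w1 * w2
    return dots


def cosine_similarities(docs, source_ids):
    """Cosine similarity for every cross-ontology pair sharing at least one word."""
    norms = {e: math.sqrt(sum(w * w for w in ws.values())) for e, ws in docs.items()}
    dots = reduce_phase(shuffle(map_phase(docs)), source_ids)
    return {
        (s, t): dot / (norms[s] * norms[t])
        for (s, t), dot in dots.items()
        if norms[s] > 0 and norms[t] > 0
    }


if __name__ == "__main__":
    # Hypothetical virtual documents: entity -> {word: weight}.
    docs = {
        "a:Person": {"person": 1.0, "name": 1.0},
        "b:Human": {"person": 1.0, "name": 1.0},
        "b:Car": {"car": 1.0},
    }
    print(cosine_similarities(docs, source_ids={"a:Person"}))
```

Under this assumption, entity pairs that share no word never meet in any reducer, so they are never compared. This is the kind of pruning that makes a word-keyed partition attractive for large ontologies.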
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, H., Hu, W., Qu, Y. (2012). Constructing Virtual Documents for Ontology Matching Using MapReduce. In: Pan, J.Z., et al. The Semantic Web. JIST 2011. Lecture Notes in Computer Science, vol 7185. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29923-0_4
DOI: https://doi.org/10.1007/978-3-642-29923-0_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29922-3
Online ISBN: 978-3-642-29923-0
eBook Packages: Computer Science (R0)