Abstract
One of the main challenges in the Data Web is identifying instances that refer to the same real-world entity. Choosing the right framework for this purpose remains tedious, because current instance matching benchmarks fail to give end users and developers the insight they need into how frameworks behave when dealing with real data. In this paper, we present LANCE, a domain-independent benchmark generator for instance matching systems for Linked Data. LANCE is the first Linked Data benchmark generator to support complex semantics-aware test cases that take into account expressive OWL constructs, in addition to the standard test cases related to structure and value transformations. LANCE supports the definition of matching tasks with varying degrees of difficulty and produces a weighted gold standard, which allows a more fine-grained analysis of the performance of instance matching tools. It accepts any linked dataset and its accompanying schema as input and produces a target dataset implementing test cases of varying levels of difficulty. We provide a comparative analysis with LANCE benchmarks to assess and identify the capabilities of state-of-the-art instance matching systems, as well as an evaluation that demonstrates the scalability of LANCE's test case generator.
This work was partially supported by the EU FP7 projects LDBC (FP7-ICT-2011-8 #317548) and H2020 PARTHENOS (#654119).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Saveta, T., Daskalaki, E., Flouris, G., Fundulaki, I., Herschel, M., Ngomo, AC.N. (2015). LANCE: Piercing to the Heart of Instance Matching Tools. In: Arenas, M., et al. The Semantic Web - ISWC 2015. ISWC 2015. Lecture Notes in Computer Science, vol 9366. Springer, Cham. https://doi.org/10.1007/978-3-319-25007-6_22
DOI: https://doi.org/10.1007/978-3-319-25007-6_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25006-9
Online ISBN: 978-3-319-25007-6
eBook Packages: Computer Science, Computer Science (R0)