Abstract
Entity typing is an essential task for constructing a knowledge base. However, many non-English knowledge bases fail to type their entities due to the absence of a reasonable local hierarchical taxonomy. Since constructing a widely accepted taxonomy is a hard problem, we propose to type these non-English entities with some widely accepted taxonomies in English, such as DBpedia, Yago and Freebase. We define this problem as cross-lingual type inference. In this paper, we present CUTE to type Chinese entities with DBpedia types. First we exploit the cross-lingual entity linking between Chinese and English entities to construct the training data. Then we propose a multi-label hierarchical classification algorithm to type these Chinese entities. Experimental results show the effectiveness and efficiency of our method.
Y. Xiao—This paper was supported by the National NSFC (No. 61472085, 61171132, 61033010, U1509213), by National Key Basic Research Program of China under No. 2015CB358800, by Shanghai Municipal Science and Technology Commission foundation key project under No. 15JC1400900. Seung-won Hwang was supported by Microsoft.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
In this paper, we strictly differentiate “category” from “type”. “Category” always refer to the part of the knowledge base such as Baidu Baike and Wikipedia named as “category”. Most of these categories actually are only tags of an entity. Instead “type” always refer to the class that an entity can be classified into.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
References
Palmero Aprosio, A., Giuliano, C., Lavelli, A.: Automatic expansion of DBpedia exploiting wikipedia cross-language information. In: Cimiano, P., Corcho, O., Presutti, V., Hollink, L., Rudolph, S. (eds.) ESWC 2013. LNCS, vol. 7882, pp. 397–411. Springer, Heidelberg (2013)
Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.G.: DBpedia: a nucleus for a web of open data. In: Aberer, K., Choi, K.-S., Noy, N., Allemang, D., Lee, K.-I., Nixon, L.J.B., Golbeck, J., Mika, P., Maynard, D., Mizoguchi, R., Schreiber, G., Cudré-Mauroux, P. (eds.) ASWC 2007 and ISWC 2007. LNCS, vol. 4825, pp. 722–735. Springer, Heidelberg (2007)
Bollacker, K., Evans, C., Paritosh, P., Sturge, T., Taylor, J.: Freebase: a collaboratively created graph database for structuring human knowledge. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pp. 1247–1250. ACM (2008)
Dong, L., Wei, F., Sun, H., Zhou, M., Xu, K.: A hybrid neural model for type classification of entity mentions. In: Proceedings of the 24th International Conference on Artificial Intelligence, pp. 1243–1249. AAAI Press (2015)
Gangemi, A., Nuzzolese, A.G., Presutti, V., Draicchio, F., Musetti, A., Ciancarini, P.: Automatic typing of DBpedia entities. In: Cudré-Mauroux, P., Heflin, J., Sirin, E., Tudorache, T., Euzenat, J., Hauswirth, M., Parreira, J.X., Hendler, J., Schreiber, G., Bernstein, A., Blomqvist, E. (eds.) ISWC 2012, Part I. LNCS, vol. 7649, pp. 65–81. Springer, Heidelberg (2012)
Hoffart, J., Suchanek, F.M., Berberich, K., Weikum, G.: Yago2: a spatially and temporally enhanced knowledge base from wikipedia. Artif. Intell. 194, 28–61 (2013)
Lee, T., Wang, Z., Wang, H., Hwang, S.W.: Attribute extraction and scoring: a probabilistic approach. In: 2013 IEEE 29th International Conference on Data Engineering (ICDE), pp. 194–205. IEEE (2013)
Lehmann, J., Isele, R., Jakob, M., Jentzsch, A., Kontokostas, D., Mendes, P.N., Hellmann, S., Morsey, M., van Kleef, P., Auer, S., et al.: DBpedia-a large-scale, multilingual knowledge base extracted from wikipedia. Semant. Web J. 5, 1–29 (2014)
Ling, X., Weld, D.S.: Fine-grained entity recognition. In: AAAI. Citeseer (2012)
Murdock, J.W., Kalyanpur, A., Welty, C., Fan, J., Ferrucci, D.A., Gondek, D., Zhang, L., Kanayama, H.: Typing candidate answers using type coercion. IBM J. Res. Dev. 56(3.4), 7:1–7:13 (2012)
Nakashole, N., Tylenda, T., Weikum, G.: Fine-grained semantic typing of emerging entities. In: ACL (1), pp. 1488–1497 (2013)
Passant, A.: dbrec — music recommendations using DBpedia. In: Patel-Schneider, P.F., Pan, Y., Hitzler, P., Mika, P., Zhang, L., Pan, J.Z., Horrocks, I., Glimm, B. (eds.) ISWC 2010, Part II. LNCS, vol. 6497, pp. 209–224. Springer, Heidelberg (2010)
Paulheim, H., Bizer, C.: Type inference on noisy RDF data. In: Alani, H., Kagal, L., Fokoue, A., Groth, P., Biemann, C., Parreira, J.X., Aroyo, L., Noy, N., Welty, C., Janowicz, K. (eds.) ISWC 2013, Part I. LNCS, vol. 8218, pp. 510–525. Springer, Heidelberg (2013)
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., et al.: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Pohl, A.: Classifying the wikipedia articles into the opencyc taxonomy. In: Proceedings of the Web of Linked Entities Workshop in Conjuction with the 11th International Semantic Web Conference, vol. 5, p. 16 (2012)
Ponzetto, S.P., Strube, M.: Deriving a large scale taxonomy from wikipedia. In: AAAI, vol. 7, pp. 1440–1445 (2007)
Ritter, A., Clark, S., Etzioni, O., et al.: Named entity recognition in tweets: an experimental study. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 1524–1534 (2011)
Silla Jr., C.N., Freitas, A.A.: A survey of hierarchical classification across different application domains. Data Min. Knowl. Discov. 22(1–2), 31–72 (2011)
Srinivas, S.: A generalization of the noisy-or model. In: Proceedings of the Ninth International Conference on Uncertainty in Artificial Intelligence, pp. 208–215. Morgan Kaufmann Publishers Inc. (1993)
Suchanek, F.M., Abiteboul, S., Senellart, P.: Paris: probabilistic alignment of relations, instances, and schema. Proc. VLDB endowment 5(3), 157–168 (2011)
Suchanek, F.M., Kasneci, G., Weikum, G.: Yago: a core of semantic knowledge. In: Proceedings of the 16th International Conference on World Wide Web, pp. 697–706. ACM (2007)
Wang, Z., Li, J., Tang, J.: Boosting cross-lingual knowledge linking via concept annotation. In: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, pp. 2733–2739. AAAI Press (2013)
Wang, Z., Li, J., Wang, Z., Tang, J.: Cross-lingual knowledge linking across wiki knowledge bases. In: Proceedings of the 21st International Conference on World Wide Web, pp. 459–468. ACM (2012)
Yosef, M.A., Bauer, S., Hoffart, J., Spaniol, M., Weikum, G.: Hyena: hierarchical type classification for entity names (2012)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Xu, B., Zhang, Y., Liang, J., Xiao, Y., Hwang, Sw., Wang, W. (2016). Cross-Lingual Type Inference. In: Navathe, S., Wu, W., Shekhar, S., Du, X., Wang, X., Xiong, H. (eds) Database Systems for Advanced Applications. DASFAA 2016. Lecture Notes in Computer Science(), vol 9642. Springer, Cham. https://doi.org/10.1007/978-3-319-32025-0_28
Download citation
DOI: https://doi.org/10.1007/978-3-319-32025-0_28
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32024-3
Online ISBN: 978-3-319-32025-0
eBook Packages: Computer ScienceComputer Science (R0)