Named Entity Tagging for Korean Using DL-CoTrain Algorithm

  • Conference paper
Information Retrieval Technology (AIRS 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3689)

Abstract

We address Korean named entity classification with a co-training method called DL-CoTrain. To extract the contextual features of a named entity, we use only a part-of-speech tagger and a simple noun phrase chunker rather than a full parser. We discuss the linguistic features of Korean that are valuable for named entity classification, and we show experimentally how large a labeled corpus and which unlabeled corpus are needed for better performance and portability of a named entity classifier. With only about a quarter of the labeled corpus, our method is competitive with its supervised counterpart.
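The abstract does not spell out the algorithm, so the following is only a minimal sketch of decision-list co-training in the general style of DL-CoTrain, not the authors' implementation. It assumes two feature views per entity (spelling and context), a few seed rules, and a smoothed-precision rule strength. The feature strings, the toy examples, the function names, and the lowered threshold `p_min=0.8` are all hypothetical choices made so that the tiny example runs.

```python
from collections import Counter, defaultdict

def induce_rules(examples, labels, view, n_per_label, p_min, alpha=0.1, k=3):
    """Induce decision-list rules (feature -> (label, strength)) for one view."""
    count_xy, count_x = Counter(), Counter()
    for ex, y in zip(examples, labels):
        if y is None:  # only examples labeled by the other view contribute
            continue
        for f in ex[view]:
            count_xy[(f, y)] += 1
            count_x[f] += 1
    by_label = defaultdict(list)
    for (f, y), c in count_xy.items():
        # Smoothed precision of the rule "feature f implies label y" (assumed form).
        strength = (c + alpha) / (count_x[f] + k * alpha)
        if strength >= p_min:
            by_label[y].append((strength, c, f))
    rules = {}
    for y, cands in by_label.items():
        cands.sort(key=lambda t: (-t[0], -t[1], t[2]))
        for strength, _, f in cands[:n_per_label]:  # keep the strongest rules
            rules[f] = (y, strength)
    return rules

def apply_rules(examples, rules, view):
    """Label each example with its strongest matching rule, or None."""
    labels = []
    for ex in examples:
        matches = [rules[f] for f in ex[view] if f in rules]
        labels.append(max(matches, key=lambda r: r[1])[0] if matches else None)
    return labels

def dl_cotrain(examples, seed_rules, rounds=2, n_step=5, p_min=0.8):
    """Alternate between the two views, growing the rule budget each round."""
    spelling_rules, context_rules = dict(seed_rules), {}
    for r in range(1, rounds + 1):
        n = n_step * r
        labels = apply_rules(examples, spelling_rules, "spelling")
        context_rules = induce_rules(examples, labels, "context", n, p_min)
        labels = apply_rules(examples, context_rules, "context")
        induced = induce_rules(examples, labels, "spelling", n, p_min)
        spelling_rules = {**induced, **seed_rules}  # seed rules are never dropped
    return spelling_rules, context_rules

# Hypothetical toy data: romanized strings stand in for real Korean features.
examples = [
    {"spelling": {"full=Seoul"}, "context": {"postposition=eseo"}},
    {"spelling": {"full=Busan"}, "context": {"postposition=eseo"}},
    {"spelling": {"full=Kim Minsu"}, "context": {"title=ssi"}},
    {"spelling": {"full=Lee Jiwon"}, "context": {"title=ssi"}},
]
seed_rules = {
    "full=Seoul": ("LOCATION", 1.0),
    "full=Kim Minsu": ("PERSON", 1.0),
}
spelling_rules, context_rules = dl_cotrain(examples, seed_rules)
predicted = apply_rules(examples, spelling_rules, "spelling")
```

In this toy run the two seed rules label two entities, the context view generalizes from them, and the spelling view then picks up the two entities the seeds did not cover. A realistic setting would use a much larger unlabeled corpus and a stricter threshold.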

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kwak, BK., Cha, JW. (2005). Named Entity Tagging for Korean Using DL-CoTrain Algorithm. In: Lee, G.G., Yamada, A., Meng, H., Myaeng, S.H. (eds) Information Retrieval Technology. AIRS 2005. Lecture Notes in Computer Science, vol 3689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562382_55

  • DOI: https://doi.org/10.1007/11562382_55

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29186-2

  • Online ISBN: 978-3-540-32001-2

  • eBook Packages: Computer Science, Computer Science (R0)
