The Logical-Linguistic Model of Fact Extraction from English Texts

  • Conference paper
Information and Software Technologies (ICIST 2016)

Abstract

In this paper we propose a logical-linguistic model for extracting required facts from English sentences. We consider a fact in the form of a triplet, Subject > Predicate > Object, in which the Predicate represents a relation and the Subject and Object denote the two entities it connects. The model is based on the grammatical and semantic features of words in English sentences. Its basic mathematical characteristic is the use of logical-algebraic equations of the algebra of finite predicates. The model was successfully implemented in a system that extracts and identifies some facts from semi-structured and unstructured English Web content.
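
The triplet form described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not give the predicate equations, so the `Token` features, the `is_subject` and `is_object` predicates, and the "nearest noun on each side of the verb" rule below are all assumptions chosen only to show how Boolean predicates over grammatical features could select the Subject and Object of a fact.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Token:
    text: str
    pos: str  # coarse part-of-speech tag, assumed to be given (e.g. "NOUN", "VERB")


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    object: str


def is_subject(tokens: List[Token], i: int, verb_idx: int) -> bool:
    # Hypothetical predicate: a noun that precedes the verb.
    return tokens[i].pos == "NOUN" and i < verb_idx


def is_object(tokens: List[Token], i: int, verb_idx: int) -> bool:
    # Hypothetical predicate: a noun that follows the verb.
    return tokens[i].pos == "NOUN" and i > verb_idx


def extract_fact(tokens: List[Token]) -> Optional[Fact]:
    # Use the first verb as the Predicate of the triplet.
    verb_idx = next((i for i, t in enumerate(tokens) if t.pos == "VERB"), None)
    if verb_idx is None:
        return None
    subjects = [i for i in range(len(tokens)) if is_subject(tokens, i, verb_idx)]
    objects = [i for i in range(len(tokens)) if is_object(tokens, i, verb_idx)]
    if not subjects or not objects:
        return None
    # Pick the nouns closest to the verb on each side.
    return Fact(tokens[subjects[-1]].text, tokens[verb_idx].text, tokens[objects[0]].text)


sentence = [
    Token("Google", "NOUN"),
    Token("acquired", "VERB"),
    Token("the", "DET"),
    Token("startup", "NOUN"),
]
fact = extract_fact(sentence)
print(fact)  # Fact(subject='Google', predicate='acquired', object='startup')
```

A real system would obtain the part-of-speech tags from a tagger and would need richer grammatical and semantic features than word order; the sketch only fixes the shape of the output.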

Author information

Corresponding author

Correspondence to Nina Feliksivna Khairova.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Khairova, N.F., Petrasova, S., Gautam, A.P.S. (2016). The Logical-Linguistic Model of Fact Extraction from English Texts. In: Dregvaite, G., Damasevicius, R. (eds) Information and Software Technologies. ICIST 2016. Communications in Computer and Information Science, vol 639. Springer, Cham. https://doi.org/10.1007/978-3-319-46254-7_51

  • DOI: https://doi.org/10.1007/978-3-319-46254-7_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46253-0

  • Online ISBN: 978-3-319-46254-7

  • eBook Packages: Computer Science, Computer Science (R0)
