Abstract
A large amount of domain-specific unstructured data is available today. To make it accessible to ordinary users, domain experts often have to extract the key points manually and convert them to layman's terms. In domains such as law, documents frequently have to be analyzed by hand to verify that all critical information is present and to extract the important points where needed. These manual, domain-specific tasks can be automated with Natural Language Processing (NLP) and Natural Language Generation (NLG) techniques. In this paper, we discuss NLP and NLG tools that can automate these processes for key information extraction. We also present two domain-specific use cases in which we apply the discussed tools to provide suggestions to subject experts and ease their work.
Snigdha Biswas and Jahnvi Gupta contributed equally to this work.
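As a concrete illustration of the kind of key information extraction the abstract refers to, the sketch below runs an off-the-shelf named entity recognizer over a sample legal clause. spaCy, the en_core_web_sm model, and the example sentence are illustrative assumptions, not tools or data used by the authors.

# Minimal sketch (illustrative only): extract key entities from a legal
# clause with an off-the-shelf NER model, the kind of NLP step that can
# spare a domain expert from scanning the document by hand.
import spacy

# Assumes the model was installed beforehand with:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

clause = (
    "This Agreement is entered into on 1 March 2023 between Acme Corp. "
    "and John Doe for a consideration of $10,000."
)

doc = nlp(clause)
for ent in doc.ents:
    # Expected output along the lines of:
    #   1 March 2023  DATE
    #   Acme Corp.    ORG
    #   John Doe      PERSON
    #   $10,000       MONEY
    print(f"{ent.text}\t{ent.label_}")

The extracted entities (parties, dates, amounts) are exactly the critical pieces of information that, per the abstract, would otherwise be pulled out of legal documents manually.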
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Varma, S., Shivam, S., Natarajan, S., Biswas, S., Gupta, J. (2024). Taking Natural Language Generation and Information Extraction to Domain Specific Tasks. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2023. Lecture Notes in Networks and Systems, vol 824. Springer, Cham. https://doi.org/10.1007/978-3-031-47715-7_48
DOI: https://doi.org/10.1007/978-3-031-47715-7_48
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47714-0
Online ISBN: 978-3-031-47715-7
eBook Packages: Intelligent Technologies and Robotics (R0)