Abstract
Medical text mining has grown in popularity because of its many applications in disease prediction and clinical recommendation systems. Radiology reports contain rich information describing radiologists' findings on the health conditions of patients in the associated radiology images. However, these reports exist in an unstructured free-text format, so the information valuable for disease prediction cannot be easily retrieved and used without suitable text mining techniques. The medical datasets currently available are small, domain-specific and restricted to the institution that holds them, yet data is one of the critical factors powering Machine Learning (ML) and Deep Learning (DL) models. To overcome the challenge of predicting disease under this low data condition, we present a practical Deep Learning framework that combines a Knowledge Base (KB) with Deep Learning for accurate text mining and prediction of lung diseases from unstructured radiology free-text reports. We adopt GloVe word embeddings with the KB trained on a large corpus for effective text modelling. Further, we incorporate Convolutional Neural Network-based Discriminative Dimensionality Reduction (CNN-DDR) to obtain the most discriminative feature vector. Finally, a fully connected Deep Neural Network (DNN) is used as the prediction model to detect the diseases. We applied the proposed framework to predict lung diseases on radiology reports from both the publicly available Indiana University (IU) dataset [6] and data collected from a private hospital. In our benchmarks, the proposed framework outperforms standard ML techniques.
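The three stages named in the abstract (word embeddings, CNN-based feature reduction, fully connected prediction) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy vocabulary, all dimensions, the single output unit, and every function name below are assumptions made for illustration; random vectors stand in for pretrained GloVe embeddings, the Knowledge Base component is omitted entirely, and the weights are untrained, so the output score is meaningless beyond showing the data flow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary; a real system would load pretrained GloVe vectors.
VOCAB = ["<pad>", "<unk>", "lungs", "are", "clear", "no",
         "pleural", "effusion", "opacity"]
WORD_TO_ID = {word: idx for idx, word in enumerate(VOCAB)}

# Assumed sizes, chosen only to keep the example small.
EMBED_DIM, NUM_FILTERS, KERNEL, HIDDEN, MAX_LEN = 50, 8, 3, 16, 12

# Random stand-in for the embedding table (padding row set to zero).
embeddings = rng.normal(size=(len(VOCAB), EMBED_DIM))
embeddings[0] = 0.0

# Untrained parameters for one convolution layer and a two-layer DNN.
conv_w = rng.normal(scale=0.1, size=(NUM_FILTERS, KERNEL, EMBED_DIM))
conv_b = np.zeros(NUM_FILTERS)
w1 = rng.normal(scale=0.1, size=(NUM_FILTERS, HIDDEN))
b1 = np.zeros(HIDDEN)
w2 = rng.normal(scale=0.1, size=(HIDDEN, 1))
b2 = np.zeros(1)


def encode(report, max_len=MAX_LEN):
    """Map a free-text report to a fixed-length array of token ids."""
    tokens = report.lower().split()[:max_len]
    ids = [WORD_TO_ID.get(tok, WORD_TO_ID["<unk>"]) for tok in tokens]
    ids += [WORD_TO_ID["<pad>"]] * (max_len - len(ids))
    return np.array(ids)


def cnn_features(ids):
    """1D convolution over word windows, ReLU, then max-over-time pooling.

    Returns one value per filter: a compact feature vector for the report.
    """
    x = embeddings[ids]  # (max_len, EMBED_DIM)
    windows = np.stack([x[i:i + KERNEL]
                        for i in range(len(ids) - KERNEL + 1)])
    conv = np.einsum("wke,fke->wf", windows, conv_w) + conv_b
    conv = np.maximum(conv, 0.0)  # ReLU
    return conv.max(axis=0)       # (NUM_FILTERS,)


def predict(report):
    """Fully connected network on the pooled features; sigmoid output."""
    feats = cnn_features(encode(report))
    hidden = np.maximum(feats @ w1 + b1, 0.0)
    logit = hidden @ w2 + b2
    return float(1.0 / (1.0 + np.exp(-logit[0])))


score = predict("no pleural effusion lungs are clear")
print(f"untrained score: {score:.3f}")
```

The max-over-time pooling step is what reduces a variable-length report to a fixed-size vector here; the paper's CNN-DDR may differ in depth, filter sizes and training objective.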
References
Agarap, A.F.: Deep learning using rectified linear units (ReLU). CoRR abs/1803.08375, March 2018
Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012)
Cao, Y., Huang, L., Ji, H., Chen, X., Li, J.: Bridge text and knowledge by learning multi-prototype entity mention embedding, pp. 1623–1633. ACL (2017)
Castro, S., Tseytlin, E., Medvedeva, O., Mitchell, K., Visweswaran, S., Bekhuis, T., Jacobson, R.: Automated annotation and classification of BI-RADS assessment from radiology reports. J. Biomed. Inform. 69, 177–187 (2017)
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P.P.: Natural language processing (almost) from scratch. CoRR (2011)
Demner-Fushman, D., Kohli, M.D., Rosenman, M.B., Shooshan, S.E., Rodriguez, L., Antani, S.K., Thoma, G.R., McDonald, C.J.: Preparing a collection of radiology examinations for distribution and retrieval. JAMIA 23(2), 304–310 (2016)
Dutta, S., Long, W.J., Brown, D.F., Reisner, A.T.: Automated detection using natural language processing of radiologists recommendations for additional imaging of incidental findings. Ann. Emerg. Med. 62(2), 162–169 (2013)
Flach, P.: Machine Learning: The Art and Science of Algorithms That Make Sense of Data. Cambridge University Press, Cambridge (2012)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. The MIT Press, Cambridge (2016)
Hassanpour, S., Bay, G., Langlotz, C.: Characterization of change and significance for clinical findings in radiology reports through natural language processing. J. Digit. Imaging 30, 314–322 (2017)
Liu, G., Hsu, T.H., McDermott, M.B.A., Boag, W., Weng, W., Szolovits, P., Ghassemi, M.: Clinically accurate chest x-ray report generation. CoRR (2019)
Santos, I., Nedjah, N., Mourelle, L.: Sentiment analysis using convolutional neural network with fastText embeddings, pp. 1–5, November 2017. https://doi.org/10.1109/LA-CCI.2017.8285683
Patel, K., Patel, D., Golakiya, M., Bhattacharyya, P., Birari, N.: Adapting pre-trained word embeddings for use in medical coding, pp. 302–306. ACL (2017)
Pennington, J., Socher, R., Manning, C.: GloVe: Global vectors for word representation, pp. 1532–1543. Association for Computational Linguistics, October 2014
Trivedi, H., Mesterhazy, J., Laguna, B., Vu, T., Sohn, J.: Automatic determination of the need for intravenous contrast in musculoskeletal MRI examinations using IBM Watson’s natural language processing algorithm. J. Digit. Imaging 31(2), 245–251 (2017)
Xue, Y., Xu, T., Rodney Long, L., Xue, Z., Antani, S., Thoma, G.R., Huang, X.: Multimodal recurrent model with attention for automated radiology report generation, pp. 457–466. Springer International Publishing, Cham (2018)
Zhang, Y., Ding, D.Y., Qian, T., Manning, C.D., Langlotz, C.P.: Learning to summarize radiology findings. CoRR (2018)
Acknowledgement
We thank the Department of Information Technology, NITK Surathkal for providing resources for this research. We are grateful to KMC Hospital, Mangalore for granting access to the de-identified radiology reports for this research (Ref. IEC KMC MLR 01-2020/80 dated 15.01.2020).
Copyright information
© 2020 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Shetty, S., Ananthanarayana, V.S., Mahale, A. (2020). Medical Knowledge-Based Deep Learning Framework for Disease Prediction on Unstructured Radiology Free-Text Reports Under Low Data Condition. In: Iliadis, L., Angelov, P., Jayne, C., Pimenidis, E. (eds) Proceedings of the 21st EANN (Engineering Applications of Neural Networks) 2020 Conference. EANN 2020. Proceedings of the International Neural Networks Society, vol 2. Springer, Cham. https://doi.org/10.1007/978-3-030-48791-1_27
DOI: https://doi.org/10.1007/978-3-030-48791-1_27
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-48790-4
Online ISBN: 978-3-030-48791-1
eBook Packages: Computer Science, Computer Science (R0)