An Interpretable Conditional Augmentation Classification Approach for Imbalanced EHRs Mortality Prediction

  • Conference paper
Data Mining and Big Data (DMBD 2022)

Abstract

Mortality prediction is one of the most crucial tasks in the ICU. Because deceased patients are far fewer than survivors, prediction models tend to over-identify survivors. In addition, the limited interpretability of current machine learning and deep learning models makes them difficult to use clinically. To address these issues, we propose the Interpretable Conditional Augmentation Classification (ICAC) method. ICAC uses a CWGAN to learn the distribution of the minority-class samples and to generate samples that balance the data. To support better clinical suggestions, the Shapley value is used to examine the marginal contribution of patient characteristics to the prediction model. We test the model on the recently released MIMIC-IV dataset, and the experimental results show that the AUC of our model is superior to that of the baseline model. The proposed method can address the class imbalance issue in EHRs, clarify how features affect model outcomes, and offer useful recommendations for clinical practice.
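
The Shapley value mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `risk_score`, its coefficients, and the `patient` and `baseline` vectors are hypothetical, and the sketch enumerates every feature coalition exactly, which is only feasible for a handful of features (practical tools typically rely on approximations). Features outside a coalition are assumed to take a fixed baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset of indices) -> float."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(n_features):
            # Weight of a coalition of this size: |S|! (n - |S| - 1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal contribution of feature i when it joins the coalition.
                phi[i] += weight * (value_fn(coalition | {i}) - value_fn(coalition))
    return phi


def risk_score(x):
    """Hypothetical toy model (not from the paper): linear terms plus one interaction."""
    a, b, c = x
    return 0.5 * a + 2.0 * b + 1.0 * b * c


def make_value_fn(model, instance, baseline):
    """Features in the coalition take the patient's values; the rest take baseline values."""
    def value_fn(coalition):
        x = [instance[j] if j in coalition else baseline[j]
             for j in range(len(instance))]
        return model(x)
    return value_fn


patient = [2.0, 1.0, 3.0]   # hypothetical feature values
baseline = [0.0, 0.0, 0.0]  # hypothetical reference values

phi = shapley_values(make_value_fn(risk_score, patient, baseline), len(patient))
print(phi)
```

In this toy setting the attributions sum to the difference between the model output for the patient and for the baseline, and the contribution of the interaction term is split equally between the two interacting features.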

Supported by the National Key R&D Program of China (2018AAA0101003) and the National Natural Science Foundation of China (Grant No. 71901050).

Corresponding author

Correspondence to Dengfeng Li.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, T., Yin, N., Gao, P., Li, D., Lu, W. (2022). An Interpretable Conditional Augmentation Classification Approach for Imbalanced EHRs Mortality Prediction. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2022. Communications in Computer and Information Science, vol 1744. Springer, Singapore. https://doi.org/10.1007/978-981-19-9297-1_29

  • DOI: https://doi.org/10.1007/978-981-19-9297-1_29

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-9296-4

  • Online ISBN: 978-981-19-9297-1

  • eBook Packages: Computer Science (R0)
