Abstract
Because the amount of data generated in healthcare grows every day, healthcare professionals are increasingly challenged by the volume of information. Predictive models based on machine learning algorithms can help to quickly identify patterns in clinical data. Requirements for data-driven decision support systems for health and care (DS4H) are in many ways similar to those of applications in other domains. However, there are also various challenges that are specific to health and care settings. The present paper describes a) healthcare-specific requirements for DS4H and b) how they were addressed in our Predictive Analytics Toolset for Health and care (PATH). PATH supports the following process: objective definition, data cleaning and pre-processing, feature engineering, evaluation, result visualization, interpretation and validation, and deployment. In its current state, the toolset already allows the user to switch between the various levels involved, i.e. raw data (ECG), pre-processed data (averaged heartbeat), extracted features (QT time), built models (to classify the ECG into a certain rhythm abnormality class) and outcome evaluation (e.g. a false positive case), and to assess the relevance of a given feature both for the currently evaluated model as a whole and for an individual decision. This allows us to gain insights that serve as a basis for improvements in the various steps from raw data to decisions.
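The idea of switching between levels can be illustrated with a minimal Python sketch. It is not the PATH implementation: the class `TracedCase`, the helper functions, the toy feature `peak_amplitude` (a stand-in, not a QT measurement) and the one-feature threshold "model" are all hypothetical and chosen only for illustration. The sketch keeps raw data, pre-processed data, features, prediction and outcome together for each case, so that a false positive can be traced back to the signal it came from.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class TracedCase:
    """One case with every pipeline level kept side by side (hypothetical structure)."""
    case_id: str
    raw_beats: list   # raw data: one list of samples per heartbeat
    label: bool       # reference annotation (True = abnormal)
    averaged_beat: list = field(default_factory=list)
    features: dict = field(default_factory=dict)
    prediction: bool = False
    outcome: str = ""


def preprocess(case):
    # Pre-processing level: sample-wise average over all beats.
    case.averaged_beat = [mean(samples) for samples in zip(*case.raw_beats)]


def extract_features(case):
    # Feature level: toy stand-in feature, not a real QT time measurement.
    case.features = {"peak_amplitude": max(case.averaged_beat)}


def classify(case, threshold=1.0):
    # Model level: assumed one-feature threshold rule instead of a trained model.
    case.prediction = case.features["peak_amplitude"] > threshold


def evaluate(case):
    # Outcome level: compare the prediction with the reference annotation.
    kinds = {
        (True, True): "true positive",
        (True, False): "false positive",
        (False, True): "false negative",
        (False, False): "true negative",
    }
    case.outcome = kinds[(case.prediction, case.label)]


def run_pipeline(cases, threshold=1.0):
    for case in cases:
        preprocess(case)
        extract_features(case)
        classify(case, threshold)
        evaluate(case)
    return cases


cases = run_pipeline([
    TracedCase("a", [[0.0, 1.2, 0.1], [0.0, 1.4, 0.1]], label=False),
    TracedCase("b", [[0.0, 0.8, 0.1], [0.0, 0.6, 0.1]], label=False),
])

# Start at the outcome level and walk back down to the raw data.
false_positives = [c for c in cases if c.outcome == "false positive"]
for c in false_positives:
    print(c.case_id, c.features, c.averaged_beat, c.raw_beats)
```

Because each case object carries all levels, inspecting a misclassified case requires no re-computation; a real toolset would more likely store references to the underlying data rather than copies.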
Funding statement: Parts of this work have been carried out with the support of several funding organisations, i.e. the K1 COMET Competence Center CBmed, which is funded by the Federal Ministry of Transport, Innovation and Technology (BMVIT); the Federal Ministry of Science, Research and Economy (BMWFW); Land Steiermark (Department 12, Business and Innovation); the Styrian Business Promotion Agency (SFG); and the Vienna Business Agency. The COMET program is executed by the FFG. We also thank SAP SE for their support.
About the authors
Dieter Hayn received his MSc in biomedical engineering from the TU Graz, his PhD from the Health and Life Science University Hall/Tyrol and his MBA from the MU Graz. He is currently working as a senior scientist at AIT. His research interests include data science, predictive modelling and biosignal processing. He is a co-editor of the eHealth20xx proceedings and (co-)author of numerous journal and conference papers.
Sai Veeranki obtained a master’s degree in IT from Alpen-Adria-Universität Klagenfurt in 2014 and graduated in health care information technology from FH Kärnten in 2016. He is currently employed at AIT and is working on his PhD, “Predictive modeling in healthcare”, in cooperation with KAGes and CBmed.
Martin Kropf has been with the AIT since 2012. He received his MSc in eHealth from the FH Joanneum Graz in 2009 and is currently doing his PhD at the TU Graz. Since 2015, he has been working as a data scientist and clinical project manager at the Charité Berlin.
Alphons Eggerth has been studying Biomedical Engineering at the Graz University of Technology and is currently working on his PhD thesis at the AIT Austrian Institute of Technology. His research focuses on data-driven decision support based on time series data from telemonitoring settings.
Karl Kreiner has been a scientist and project manager at the AIT since 2003. He has more than 10 years of experience in telemonitoring and machine learning applications. He has contributed to more than 30 national and international research and industry projects and is the author of various scientific publications.
Diether Kramer completed his studies in sociology and economics and received his PhD from the University of Graz in 2013. Since 2007, he has worked at the University of Graz, then as a freelancer for the Max Planck Institute for Demographic Research, as well as for the Wirtschaftsnachrichten and AVL List. From 2014 to 2015 he worked as a consultant for IMS-Health. Since the end of 2015 he has been responsible for innovative data use at the KAGes.
Günter Schreier received the doctoral and Habilitation degrees in electrical engineering and biomedical informatics from the Graz University of Technology and is currently the thematic coordinator for “Predictive Healthcare Information Systems” with the AIT Austrian Institute of Technology. He serves as the President of the Austrian Society of Biomedical Engineering and the annual scientific eHealth conference in Vienna. He has (co-)authored 300+ scientific publications and presentations and advises the Austrian Ministry of Health and the European Commission.
© 2018 Walter de Gruyter GmbH, Berlin/Boston