Abstract
Data streams pose a significant challenge for data analysis and data mining techniques, because those techniques were developed for training on batch data. A classification technique that deals with a data stream should be able to adapt its model to new samples and forget old ones. In this paper, we present an intensive comparison of the performance of six popular classification techniques, focusing on the power of Adaptive Random Forest. The comparison was based on four real medical datasets; for more reliable results, 40 further datasets were created by adding white noise to the original datasets. The experimental results showed the dominance of Adaptive Random Forest over the five other techniques, with high robustness against changes in the data and against noise.
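The noise-augmentation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the noise distribution, its scaling, or the number of noise levels per dataset, so the code below assumes zero-mean Gaussian white noise scaled by each feature's standard deviation and ten hypothetical noise levels; the function name `make_noisy_copies` and the synthetic array `X_demo` are illustrative only and are not taken from the paper.

```python
import numpy as np


def make_noisy_copies(X, noise_levels, seed=0):
    """Return one noisy copy of X per entry in noise_levels.

    Assumption (not from the paper): the noise is zero-mean Gaussian,
    and its standard deviation for each feature is
    level * (standard deviation of that feature).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    feature_std = X.std(axis=0)  # per-feature scale
    copies = []
    for level in noise_levels:
        # Independent draw for every cell, scaled per feature and level.
        noise = rng.normal(0.0, 1.0, size=X.shape) * feature_std * level
        copies.append(X + noise)
    return copies


# Hypothetical stand-in for one medical dataset: 200 samples, 5 features.
X_demo = np.random.default_rng(42).normal(size=(200, 5))

# Ten hypothetical noise levels from 5% to 50% of each feature's std.
levels = [0.05 * k for k in range(1, 11)]
noisy_sets = make_noisy_copies(X_demo, levels)
```

Under these assumptions, applying ten noise levels to each of four datasets would give 40 noisy datasets, which matches the count in the abstract, although the actual split used by the authors may differ.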
A. Kiss was also with J. Selye University, Komarno, Slovakia.
Acknowledgment
The project was supported by the European Union, co-financed by the European Social Fund (EFOP-3.6.3-VEKOP-16-2017-00002).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Fatlawi, H.K., Kiss, A. (2020). On Robustness of Adaptive Random Forest Classifier on Biomedical Data Stream. In: Nguyen, N., Jearanaitanakij, K., Selamat, A., Trawiński, B., Chittayasothorn, S. (eds) Intelligent Information and Database Systems. ACIIDS 2020. Lecture Notes in Computer Science(), vol 12033. Springer, Cham. https://doi.org/10.1007/978-3-030-41964-6_29
DOI: https://doi.org/10.1007/978-3-030-41964-6_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41963-9
Online ISBN: 978-3-030-41964-6
eBook Packages: Computer Science, Computer Science (R0)