Abstract
With the rapid development of information technology, industries must handle ever-growing volumes of data. Unlike traditional static data, stream data in a big data environment arrives rapidly and continuously and changes over time. Meanwhile, changes in the hidden underlying distribution of a data stream give rise to concept drift. This paper proposes a concept drift detection algorithm for stream data, named ADDS (Anti-concept Drift Detection Algorithm), which detects and handles hidden concept drift in non-stationary data streams in a big data environment. ADDS improves traditional classification algorithms by making them incremental, so that they meet the demands of streaming data processing. Experimental results show that ADDS achieves better concept drift detection performance.
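The abstract does not spell out how ADDS detects drift, so the following is only a generic illustration of the idea, not the ADDS procedure. It is a minimal sketch that assumes a classifier emits a 0/1 error per instance and that drift shows up as a rise in the recent error rate relative to a reference window. The class name `WindowDriftDetector`, the `window` and `delta` parameters, and the Hoeffding-style threshold are all assumptions chosen for illustration.

```python
import math
from collections import deque


def hoeffding_epsilon(n_ref: int, n_recent: int, delta: float) -> float:
    """Hoeffding-style threshold on the gap between two window means in [0, 1]."""
    m = 1.0 / (1.0 / n_ref + 1.0 / n_recent)  # effective sample size
    return math.sqrt(math.log(4.0 / delta) / (2.0 * m))


class WindowDriftDetector:
    """Hypothetical detector: compares a reference window with a sliding recent window."""

    def __init__(self, window: int = 200, delta: float = 0.01):
        self.window = window
        self.delta = delta
        self.reference = deque(maxlen=window)  # errors seen under the old concept
        self.recent = deque(maxlen=window)     # most recent errors (slides forward)

    def update(self, error: int) -> bool:
        """Feed one 0/1 prediction error; return True when drift is flagged."""
        if len(self.reference) < self.window:
            self.reference.append(error)
            return False
        self.recent.append(error)
        if len(self.recent) < self.window:
            return False
        ref_mean = sum(self.reference) / len(self.reference)
        rec_mean = sum(self.recent) / len(self.recent)
        eps = hoeffding_epsilon(len(self.reference), len(self.recent), self.delta)
        if rec_mean - ref_mean > eps:
            # Drift flagged: drop both windows so the reference is rebuilt
            # from post-drift data (an incremental learner would adapt here).
            self.reference.clear()
            self.recent.clear()
            return True
        return False


def simulate() -> list:
    """Synthetic stream: 10% error rate, then 50% from index 1000 onward."""
    detector = WindowDriftDetector(window=200, delta=0.01)
    flagged = []
    for i in range(2000):
        if i < 1000:
            error = 1 if i % 10 == 0 else 0  # old concept: 1 error in 10
        else:
            error = 1 if i % 2 == 0 else 0   # new concept: 1 error in 2
        if detector.update(error):
            flagged.append(i)
    return flagged


detections = simulate()
print("drift flagged at stream positions:", detections)
```

In this synthetic stream the detector raises a single alarm shortly after position 1000, once enough post-drift errors have entered the recent window. Because the test is repeated at every step, the nominal `delta` is only a rough guide to the false-alarm rate in this sketch.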
Acknowledgement
This work was supported in part by the National Key Research and Development Program of China (2017YFB0202200); the National Natural Science Foundation of China (61373017, 61572261, 61170065); the Outstanding Young Fund Project of the Jiangsu Natural Science Foundation of China (BK20170100); the Jiangsu Key Research and Development Program (BE2017166); the Open-End Fund of the Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks (WSNLBZY201514); and the Research Project of Nanjing University of Posts and Telecommunications (NY214067).
Copyright information
© 2017 Springer Nature Singapore Pte Ltd
About this paper
Cite this paper
Liu, S., Lu, L., Zhang, Y., Xin, T., Ji, Y., Wang, R. (2017). Research on Concept Drift Detection for Decision Tree Algorithm in the Stream of Big Data. In: Chen, G., Shen, H., Chen, M. (eds) Parallel Architecture, Algorithm and Programming. PAAP 2017. Communications in Computer and Information Science, vol 729. Springer, Singapore. https://doi.org/10.1007/978-981-10-6442-5_21
DOI: https://doi.org/10.1007/978-981-10-6442-5_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6441-8
Online ISBN: 978-981-10-6442-5
eBook Packages: Computer Science, Computer Science (R0)