Abstract
Data stream mining is an active research area in the machine learning community. Discovering knowledge from large amounts of data continuously generated by online services and real-time applications is a challenging task for data analytics, one that requires robust and efficient online algorithms. This paper presents a novel method for data stream mining that addresses two main challenges of data stream processing: concept drift and feature evolution in textual data streams. To address these issues, the proposed method uses the Artificial Immune System (AIS) metaheuristic. AIS has powerful adaptive capabilities that make it robust even in changing environments. The proposed algorithm, AIS-Clus, adapts its model to handle concept drift and feature evolution in textual data streams. Experiments were performed on a textual dataset, and the results obtained are efficient and promising.
Acknowledgment
This paper was made possible by NPRP grant #9-175-033 from the Qatar National Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Abid, A., Jamoussi, S., Hamadou, A.B. (2018). Handling Concept Drift and Feature Evolution in Textual Data Stream Using the Artificial Immune System. In: Nguyen, N., Pimenidis, E., Khan, Z., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2018. Lecture Notes in Computer Science, vol 11055. Springer, Cham. https://doi.org/10.1007/978-3-319-98443-8_33
DOI: https://doi.org/10.1007/978-3-319-98443-8_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98442-1
Online ISBN: 978-3-319-98443-8
eBook Packages: Computer Science, Computer Science (R0)