Abstract
Many text classification problems in social networks, and in other contexts, are dynamic: concepts drift over time, and the meaningful labels change with them. In Twitter-based applications in particular, ensembles are often applied to problems that fit this description, for example sentiment analysis or adapting to drifting circumstances. While it can be straightforward to collect the input of different classifiers in such ensembles, our goal is to boost dynamic ensembles by combining performance metrics as efficiently as possible. We present a twofold performance-based framework that classifies incoming tweets based on recent tweets. On the one hand, each classifier's individual performance defines its contribution to the ensemble. On the other hand, examples are actively selected based on their ability to contribute to the performance in classifying drifting concepts. The main step of the algorithm uses different performance metrics to determine both each classifier's strength in the ensemble and each example's importance, and hence its lifetime, in the learning process. We demonstrate, on a benchmark dataset with drift, that our framework improves classification performance considerably, enough to make a difference in a variety of applications.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Cite this article
Costa, J., Silva, C., Antunes, M. et al. Boosting dynamic ensemble’s performance in Twitter. Neural Comput & Applic 32, 10655–10667 (2020). https://doi.org/10.1007/s00521-019-04599-7