
Boosting dynamic ensemble’s performance in Twitter

  • Original Article
  • Published:
Neural Computing and Applications

Abstract

Many text classification problems in social networks, and in other contexts, are also dynamic problems, in which concepts drift through time and meaningful labels change with them. In Twitter-based applications in particular, ensembles are often applied to problems that fit this description, such as sentiment analysis or adaptation to drifting circumstances. While combining the outputs of different classifiers in such ensembles is straightforward, our goal is to boost dynamic ensembles by combining performance metrics as efficiently as possible. We present a twofold performance-based framework that classifies incoming tweets based on recent tweets. On the one hand, each ensemble classifier's individual performance determines its contribution to the ensemble decision. On the other hand, examples are actively selected according to how effectively they contribute to the performance on drifting concepts. The main step of the algorithm uses different performance metrics to determine both each classifier's strength in the ensemble and each example's importance, and hence its lifetime, in the learning process. We demonstrate, on a drifted benchmark dataset, that the framework improves classification performance considerably, enough to make a difference in a variety of applications.
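As a rough illustration of the twofold idea above, the sketch below shows one way a performance-weighted dynamic ensemble with active example selection could be organized. It is a minimal sketch under assumed simplifications: the class name, the accuracy-based weight, and the misclassification-based retention rule are illustrative stand-ins, not the exact performance metrics or update rules of the paper's algorithm.

```python
from collections import deque

class PerformanceWeightedEnsemble:
    """Hypothetical sketch: weight base classifiers by recent performance and
    retain only examples that still help with drifting concepts."""

    def __init__(self, classifiers, window=500):
        self.classifiers = list(classifiers)          # pre-trained base models exposing .predict(x)
        self.weights = [1.0] * len(self.classifiers)  # one performance-based weight per model
        self.recent = deque(maxlen=window)            # retained (features, label) pairs for retraining

    def predict(self, x):
        # Weighted vote: each base classifier adds its current weight to its predicted label.
        votes = {}
        for clf, weight in zip(self.classifiers, self.weights):
            label = clf.predict(x)
            votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get)

    def update(self, batch):
        """batch: list of (x, true_label) pairs from the newest tweets."""
        if not batch:
            return
        # 1) Re-weight each classifier by its performance on the newest batch
        #    (plain accuracy here, a stand-in for the paper's performance metrics).
        for i, clf in enumerate(self.classifiers):
            correct = sum(1 for x, y in batch if clf.predict(x) == y)
            self.weights[i] = correct / len(batch)
        # 2) Actively select examples: retain those the weighted ensemble still
        #    misclassifies, since they are the ones most likely to help it adapt;
        #    the bounded deque gives each retained example a finite lifetime.
        for x, y in batch:
            if self.predict(x) != y:
                self.recent.append((x, y))
```

In such a setup, `update` would be called after each labelled batch of recent tweets, and the examples kept in `recent` would feed periodic retraining of the base classifiers.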


Notes

  1. https://dev.Twitter.com/.

  2. http://svmlight.joachims.org/.


Author information

Corresponding author

Correspondence to Joana Costa.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Costa, J., Silva, C., Antunes, M. et al. Boosting dynamic ensemble’s performance in Twitter. Neural Comput & Applic 32, 10655–10667 (2020). https://doi.org/10.1007/s00521-019-04599-7

