Abstract
Ensemble approaches have proven remarkably effective across a range of learning challenges, particularly in dynamic scenarios with concept drift, such as social networks like Twitter. Considerable effort has gone into defining strategies for combining the models that constitute an ensemble. In this work, we investigate the effect of using different metrics, specifically performance-based metrics, to combine an ensemble’s models. We propose five performance-based combining metrics, on the premise that diversity among classifiers can be exploited when each classifier’s individual performance determines its contribution to the ensemble. Experimental results on an artificially timestamped Twitter dataset suggest that using performance metrics to combine the models of an ensemble can bring relevant improvements in overall ensemble performance.
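The general idea of performance-based combination can be illustrated with a minimal sketch. The abstract does not name the five proposed metrics, so the example below assumes plain accuracy on a held-out validation set as a stand-in weight; the function names (`accuracy`, `weighted_vote`, `fuse`) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_vote(sample_preds, weights):
    """Return the label with the largest summed weight for one sample.

    sample_preds holds one predicted label per model; weights holds the
    matching per-model weights (assumed here to be performance scores).
    """
    totals = defaultdict(float)
    for label, weight in zip(sample_preds, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def fuse(model_preds, weights):
    """Combine per-model prediction lists into one ensemble prediction list."""
    return [weighted_vote(sample, weights) for sample in zip(*model_preds)]


# Toy validation data (hypothetical): three models scored on six samples.
y_val = [1, 0, 1, 1, 0, 0]
val_preds = [
    [1, 0, 1, 1, 0, 0],  # model A
    [1, 1, 0, 1, 0, 1],  # model B
    [0, 1, 0, 1, 1, 0],  # model C
]
# Assumption: each model's weight is its validation accuracy.
weights = [accuracy(y_val, preds) for preds in val_preds]

# Predictions of the same three models on three new samples.
new_preds = [
    [1, 0, 1],  # model A
    [0, 1, 1],  # model B
    [0, 1, 0],  # model C
]
performance_weighted = fuse(new_preds, weights)
simple_majority = fuse(new_preds, [1.0, 1.0, 1.0])
```

In this toy setting the better-performing model can outvote two weaker ones, so the performance-weighted result may differ from a simple majority vote. Any other performance measure could be substituted for `accuracy` without changing the voting code.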
Acknowledgment
This work is also financed by national funding via the Foundation for Science and Technology and by the European Regional Development Fund (FEDER), through the COMPETE 2020 - Operational Program for Competitiveness and Internationalization (POCI).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Costa, J., Silva, C., Antunes, M., Ribeiro, B. (2017). Performance Metrics for Model Fusion in Twitter Data Drifts. In: Alexandre, L., Salvador Sánchez, J., Rodrigues, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2017. Lecture Notes in Computer Science, vol 10255. Springer, Cham. https://doi.org/10.1007/978-3-319-58838-4_2
DOI: https://doi.org/10.1007/978-3-319-58838-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58837-7
Online ISBN: 978-3-319-58838-4
eBook Packages: Computer Science, Computer Science (R0)