Abstract
In real-world applications, it has been observed that class imbalance (significant differences in class prior probabilities) can severely degrade classifier performance, particularly on patterns belonging to the less represented classes. One way to tackle this problem is to resample the original training set, either by over-sampling the minority class and/or under-sampling the majority class. In this paper, we propose two ensemble models (using a modular neural network and the nearest neighbor rule) trained on datasets under-sampled with genetic algorithms. Experiments with real datasets demonstrate the effectiveness of the proposed methodology.
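The abstract does not spell out the GA encoding or the fitness criterion, so the sketch below only illustrates the general idea under stated assumptions: a binary chromosome marks which majority-class samples to keep, fitness is the geometric mean of per-class recalls obtained by a 1-NN classifier on a held-out validation split, and selection, crossover and mutation follow a plain generational GA. The names `ga_undersample` and `g_mean`, and all parameter values, are illustrative and not taken from the paper.

```python
# Minimal sketch (assumptions noted above): a genetic algorithm searches for a
# subset of majority-class samples such that a 1-NN classifier trained on the
# balanced subset maximises the geometric mean of per-class recalls on a
# validation split. Names and parameters are illustrative, not from the paper.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls; insensitive to class priors."""
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.prod(recalls)) ** (1.0 / len(recalls))


def ga_undersample(X_maj, y_maj, X_min, y_min, X_val, y_val,
                   pop_size=20, generations=30, p_mut=0.01, seed=0):
    """Return the majority-class subset selected by the best chromosome."""
    rng = np.random.default_rng(seed)
    n_maj, n_keep = len(X_maj), len(X_min)

    # Each chromosome is a boolean mask over the majority class, initialised
    # to keep roughly as many samples as the minority class contains.
    pop = np.array([rng.permutation(n_maj) < n_keep for _ in range(pop_size)])

    def fitness(mask):
        X = np.vstack([X_maj[mask], X_min])
        y = np.concatenate([y_maj[mask], y_min])
        clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
        return g_mean(y_val, clf.predict(X_val))

    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        # Binary tournament selection.
        idx = [max(rng.choice(pop_size, 2, replace=False), key=lambda i: scores[i])
               for _ in range(pop_size)]
        parents = pop[idx]
        # Single-point crossover followed by bit-flip mutation.
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            cut = int(rng.integers(1, n_maj))
            child = np.concatenate([a[:cut], b[cut:]])
            child = child ^ (rng.random(n_maj) < p_mut)
            children.append(child)
        pop = np.array(children + list(parents[: pop_size - len(children)]))

    best = pop[int(np.argmax([fitness(ind) for ind in pop]))]
    return X_maj[best], y_maj[best]
```

Running `ga_undersample` several times with different seeds would yield differently under-sampled training sets; training one classifier per set and combining their outputs (for example by majority vote) is one plausible way to obtain an ensemble of the kind described in the abstract.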
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Cleofas, L., Valdovinos, R.M., García, V., Alejo, R. (2009). Use of Ensemble Based on GA for Imbalance Problem. In: Yu, W., He, H., Zhang, N. (eds) Advances in Neural Networks – ISNN 2009. ISNN 2009. Lecture Notes in Computer Science, vol 5552. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01510-6_62
DOI: https://doi.org/10.1007/978-3-642-01510-6_62
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01509-0
Online ISBN: 978-3-642-01510-6