Abstract
Real-world classification problems usually involve imbalanced data sets. In such cases, high overall classification accuracy does not necessarily imply good classification performance on every class. The Area Under the ROC Curve (AUC) has been recognized as a more appropriate performance indicator in such cases, and quite a few methods have been developed to design classifiers with maximum AUC. In the context of Neural Networks (NNs), however, it is usually an approximation of AUC rather than the exact AUC that is maximized, because AUC is non-differentiable and cannot be directly maximized by gradient-based methods. In this paper, we propose to use evolutionary algorithms to train NNs with maximum AUC. The proposed method employs AUC itself as the objective function: an evolutionary algorithm, namely Self-adaptive Differential Evolution with Neighborhood Search (SaNSDE), optimizes the weights of NNs with respect to AUC. Empirical studies on 19 binary and multi-class imbalanced data sets show that the proposed evolutionary AUC maximization (EAM) method can train NNs with larger AUC than existing methods.
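The idea of using the exact AUC as a fitness function can be illustrated with a minimal sketch. This is not the SaNSDE algorithm of the paper: a plain DE/rand/1/bin differential evolution is swapped in for brevity, and the network shape (one tanh hidden layer, scalar output), the toy data generator, and every function name and parameter value (`auc`, `nn_score`, `evolve`, `make_data`, `F`, `CR`, population size) are assumptions made for illustration only. AUC is computed as the Wilcoxon-Mann-Whitney statistic for the binary case.

```python
import math
import random


def auc(scores, labels):
    """Exact binary AUC as the Wilcoxon-Mann-Whitney statistic:
    the fraction of (positive, negative) pairs ranked correctly,
    with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def nn_score(w, x, n_hidden):
    """Scalar output of a one-hidden-layer tanh network (assumed layout).
    w holds n_hidden blocks of (d input weights + 1 bias), then
    n_hidden output weights, then 1 output bias."""
    d = len(x)
    out = w[-1]
    for h in range(n_hidden):
        base = h * (d + 1)
        z = w[base + d] + sum(w[base + i] * x[i] for i in range(d))
        out += w[n_hidden * (d + 1) + h] * math.tanh(z)
    return out


def evolve(X, y, n_hidden=3, pop_size=20, generations=40,
           F=0.5, CR=0.9, seed=0):
    """Plain DE/rand/1/bin over the flat weight vector, fitness = AUC.
    A stand-in for SaNSDE, which additionally self-adapts its parameters."""
    rng = random.Random(seed)
    dim = n_hidden * (len(X[0]) + 2) + 1

    def fitness(w):
        return auc([nn_score(w, x, n_hidden) for x in X], y)

    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    fit = [fitness(w) for w in pop]
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.sample(others, 3)
            j_rand = rng.randrange(dim)  # forces at least one mutated gene
            trial = [pop[a][k] + F * (pop[b][k] - pop[c][k])
                     if (rng.random() < CR or k == j_rand) else pop[i][k]
                     for k in range(dim)]
            f = fitness(trial)
            if f >= fit[i]:  # greedy selection; ties accepted to cross plateaus
                pop[i], fit[i] = trial, f
    best = max(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


def make_data(n_neg=90, n_pos=10, seed=1):
    """Hypothetical imbalanced toy set: two 2-D Gaussian classes."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n_neg)]
    X += [[rng.gauss(1.5, 1), rng.gauss(1.5, 1)] for _ in range(n_pos)]
    return X, [0] * n_neg + [1] * n_pos


if __name__ == "__main__":
    X, y = make_data()
    weights, train_auc = evolve(X, y)
    print("training AUC of best individual:", train_auc)
```

Because selection only compares fitness values, no gradient of AUC is needed, which is the property the abstract relies on. The sketch reports training AUC only; a real comparison would evaluate on held-out data.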
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lu, X., Tang, K., Yao, X. (2010). Evolving Neural Networks with Maximum AUC for Imbalanced Data Classification. In: Graña Romay, M., Corchado, E., Garcia Sebastian, M.T. (eds) Hybrid Artificial Intelligence Systems. HAIS 2010. Lecture Notes in Computer Science, vol 6076. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13769-3_41
DOI: https://doi.org/10.1007/978-3-642-13769-3_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13768-6
Online ISBN: 978-3-642-13769-3
eBook Packages: Computer Science (R0)