Abstract
This study proposes a novel PSO–CS-SVM model that hybridizes particle swarm optimization (PSO) with a cost-sensitive support vector machine (CS-SVM) to address unbalanced data classification and asymmetric misclassification costs in loan default discrimination. Cost-sensitive learning is applied by integrating the misclassification cost of each sample into the standard SVM, and PSO is employed to determine the parameters of the CS-SVM. The financial data are discretized with a self-organizing map neural network, and the evaluation indices are reduced without information loss by a genetic algorithm to decrease the complexity of the model. The effectiveness of the integrated model is verified in three experiments that compare it with traditional CS-SVM, PSO–SVM, SVM and a BP neural network on real loan default data from companies in China. The results indicate that the developed approach markedly improves the accuracy rate, hit rate, covering rate and lift coefficient. The proposed method can accurately control the distribution of the different error types under various misclassification costs, substantially reduce the total misclassification cost, and discriminate loan defaults effectively.
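The coupling of cost-sensitive SVM training with PSO parameter search can be illustrated with a minimal sketch. It is not the authors' implementation: the solver is a simplified bias-free kernel SVM trained by dual coordinate ascent, the per-sample box bound C * cost is one common way to make an SVM cost-sensitive, and the 5:1 cost ratio, the synthetic data, the search ranges and all function names are assumptions chosen for illustration. The self-organizing map discretization and the genetic-algorithm index reduction are omitted.

```python
import math
import random

def rbf(a, b, gamma):
    """RBF kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def train_cs_svm(X, y, C, gamma, cost, epochs=10):
    """Bias-free kernel SVM via dual coordinate ascent (simplified solver).
    Sample i gets its own box bound C * cost[y_i], so errors on the
    expensive class are penalised more heavily."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            f_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            step = (1.0 - y[i] * f_i) / K[i][i]  # K[i][i] == 1 for RBF
            alpha[i] = min(max(alpha[i] + step, 0.0), C * cost[y[i]])
    return alpha

def predict(X, y, alpha, gamma, x):
    """Sign of the kernel expansion over the support vectors."""
    score = sum(a * yi * rbf(xi, x, gamma)
                for a, yi, xi in zip(alpha, y, X) if a > 0)
    return 1 if score >= 0 else -1

def total_cost(y_true, y_pred, cost):
    """Sum of cost[true label] over all misclassified samples."""
    return sum(cost[t] for t, p in zip(y_true, y_pred) if t != p)

def pso(objective, bounds, n_particles=6, n_iter=8,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Inertia-weight PSO that minimises `objective` inside `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def make_data(n_neg, n_pos, seed):
    """Synthetic imbalanced data: label +1 stands for the rare default class."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n_neg)]
    X += [[rng.gauss(1.5, 1.0), rng.gauss(1.5, 1.0)] for _ in range(n_pos)]
    return X, [-1] * n_neg + [1] * n_pos

# Hypothetical costs: missing a default (+1) costs 5, a false alarm costs 1.
COST = {1: 5.0, -1: 1.0}
X_tr, y_tr = make_data(32, 8, seed=1)
X_va, y_va = make_data(32, 8, seed=2)

def validation_cost(params):
    """PSO objective: total misclassification cost on the validation set."""
    C, gamma = 2.0 ** params[0], 2.0 ** params[1]
    alpha = train_cs_svm(X_tr, y_tr, C, gamma, COST)
    preds = [predict(X_tr, y_tr, alpha, gamma, x) for x in X_va]
    return total_cost(y_va, preds, COST)

# Search ranges for (log2 C, log2 gamma) are assumptions for this sketch.
BOUNDS = [(-3.0, 5.0), (-5.0, 1.0)]
best_params, best_cost = pso(validation_cost, BOUNDS)
print("log2 C, log2 gamma:", best_params, "validation cost:", best_cost)
```

Searching on a log2 scale lets a small swarm cover several orders of magnitude of C and gamma. Using the total misclassification cost as the PSO objective, rather than plain accuracy, is what ties the parameter search to the asymmetric costs in this sketch.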
Acknowledgments
The authors would like to thank the anonymous referees for their valuable comments and suggestions, which helped to improve the quality of the paper immensely. This work is partially supported by NSFC (60804047), the Science and Technology Project of Jiangsu Province, China (BE2010201), the Humanities and Social Sciences Research Project of the Ministry of Education (11YJCZH005), the Philosophy and Social Science Project of the Jiangsu Provincial Department of Education (2010SJB790025) and the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Cite this article
Cao, J., Lu, H., Wang, W. et al. A loan default discrimination model using cost-sensitive support vector machine improved by PSO. Inf Technol Manag 14, 193–204 (2013). https://doi.org/10.1007/s10799-013-0161-1
DOI: https://doi.org/10.1007/s10799-013-0161-1