Abstract
Software defect prediction helps ensure the quality of software development and reduces software maintenance costs. However, class imbalance can degrade the accuracy of defect classification and remains a pressing problem. We propose a novel software defect prediction model based on a twin support vector machine to address imbalanced data classification and improve prediction performance. The model obtains the class structure information through clustering and embeds the within-class structure of the training samples into the objective function as a regularization term, thereby taking into account the structural information hidden in the data. Moreover, by introducing within-class structure information to maximize the within-class distances and one class intervals, the model produces a superior classification hyperplane and enhances the generalization ability of the support vector machine. Experimental results show that, compared with existing algorithms, the proposed algorithm achieves higher prediction accuracy, adapts more robustly, and performs better when classifying imbalanced data.
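The abstract does not state the objective function, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' model. It swaps in a least-squares twin SVM (which has a closed-form solution) for whatever solver the paper uses, takes a tiny k-means as the clustering step, and assumes the structural regularizer is the summed per-cluster scatter matrix of each class. All function names (`kmeans_labels`, `within_class_scatter`, `fit_structural_lstsvm`, `predict`) and parameters (`c1`, `c2`, `k`, `eps`) are hypothetical.

```python
import numpy as np


def kmeans_labels(X, k, n_iter=20, seed=0):
    """Tiny k-means (assumed stand-in for the paper's clustering step)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def within_class_scatter(X, k):
    """Sum of per-cluster scatter matrices, zero-padded for the bias term."""
    labels = kmeans_labels(X, k)
    d = X.shape[1]
    S = np.zeros((d + 1, d + 1))
    for j in range(k):
        C = X[labels == j]
        if len(C) > 1:
            D = C - C.mean(axis=0)
            S[:d, :d] += D.T @ D / len(C)
    return S


def fit_structural_lstsvm(A, B, c1=1.0, c2=0.1, k=2, eps=1e-6):
    """Fit two nonparallel planes z = [w; b].

    A holds the samples of class +1 (e.g. defective modules),
    B holds the samples of class -1. c2 weights the assumed
    within-class structure penalty; eps is a small ridge term.
    """
    H = np.hstack([A, np.ones((len(A), 1))])
    G = np.hstack([B, np.ones((len(B), 1))])
    ridge = eps * np.eye(H.shape[1])
    # Plane 1: close to class A, roughly unit distance from class B.
    M1 = H.T @ H + c1 * (G.T @ G) + c2 * within_class_scatter(A, k) + ridge
    z1 = -np.linalg.solve(M1, c1 * G.T @ np.ones(len(B)))
    # Plane 2: close to class B, roughly unit distance from class A.
    M2 = G.T @ G + c1 * (H.T @ H) + c2 * within_class_scatter(B, k) + ridge
    z2 = np.linalg.solve(M2, c1 * H.T @ np.ones(len(A)))
    return z1, z2


def predict(X, z1, z2):
    """Assign each row of X to the class whose plane is nearer."""
    Xe = np.hstack([X, np.ones((len(X), 1))])
    d1 = np.abs(Xe @ z1) / np.linalg.norm(z1[:-1])
    d2 = np.abs(Xe @ z2) / np.linalg.norm(z2[:-1])
    return np.where(d1 <= d2, 1, -1)


if __name__ == "__main__":
    # Synthetic imbalanced data: 20 "defective" vs 200 "clean" samples.
    rng = np.random.default_rng(1)
    A = rng.normal(loc=4.0, scale=0.5, size=(20, 2))
    B = rng.normal(loc=0.0, scale=0.5, size=(200, 2))
    z1, z2 = fit_structural_lstsvm(A, B)
    print("minority-class recall:", (predict(A, z1, z2) == 1).mean())
```

The closed-form solve keeps the sketch short. A faithful reproduction would need the objective, clustering method, and any class weighting described in the full paper.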
Data availability
Inquiries about data availability should be directed to the authors.
Funding
This work was supported by the National Natural Science Foundation of Guangxi (No. 2022GXNSFAA035552, 2021GXNSFAA220114), the Guangxi University Young Teachers Foundation Competence Improvement Project (No. 2021KY0592), Natural Science Foundation of China (No. 12261096), Guangxi Natural Science Foundation (No. 2020GXNSFAA159155), and the Natural Science Foundation of Yulin City of China (No. 202125001).
Ethics declarations
Conflicts of interest
All authors declare no conflict of interest.
About this article
Cite this article
Liu, J., Lei, J., Liao, Z. et al. Software defect prediction model based on improved twin support vector machines. Soft Comput 27, 16101–16110 (2023). https://doi.org/10.1007/s00500-023-07984-6
DOI: https://doi.org/10.1007/s00500-023-07984-6