Abstract
The major challenge is to validate a software failure dataset by finding the unknown model parameters used. For software assurance, many earlier attempts at software fault prediction relied on classical classifiers such as Decision Tree, Naïve Bayes, and k-NN. However, the accuracy of fault prediction remains very low because defect-prone modules are far fewer than defect-free modules. To address this module fault classification problem and improve the accuracy of reliability prediction, we propose a hybrid algorithm that combines particle swarm optimization and a modified genetic algorithm for feature selection with bagging for classifying the modules in a dataset as defective or non-defective. This paper presents an empirical study on the NASA Metrics Data Program (MDP) datasets using the proposed hybrid algorithm; the results show that the proposed approach improves classification accuracy compared with existing methods.
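The general idea of wrapper-style feature selection scored by a bagged classifier can be illustrated with a minimal sketch. It is not the authors' hybrid algorithm: it uses a plain genetic algorithm only (no particle swarm step, and none of the paper's modifications), a nearest-centroid base learner, and synthetic imbalanced data in place of the NASA MDP datasets. All function names and parameter values (`make_data`, `ga_select`, population size, mutation rate, and so on) are assumptions chosen for illustration.

```python
import random

def make_data(n=200, n_features=6, defect_rate=0.2, seed=0):
    """Synthetic imbalanced data; only features 0 and 1 carry signal (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < defect_rate else 0
        X.append([rng.gauss(2.0 * label, 1.0) if j < 2 else rng.gauss(0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y

def fit_centroids(X, y, mask):
    """Base learner: per-class mean vector over the selected features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    return cols, centroids

def predict_one(model, x):
    """Assign x to the class with the nearest centroid (squared distance)."""
    cols, centroids = model
    def dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(cols, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

def bagging_fit(X, y, mask, n_estimators=11, seed=0):
    """Train one base learner per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx], mask))
    return models

def bagging_predict(models, x):
    """Majority vote over the ensemble."""
    votes = [predict_one(m, x) for m in models]
    return max(set(votes), key=votes.count)

def fitness(mask, train, test):
    """Held-out accuracy of the bagged classifier on the masked features."""
    if not any(mask):
        return 0.0  # an empty feature subset is treated as worthless
    (Xtr, ytr), (Xte, yte) = train, test
    models = bagging_fit(Xtr, ytr, mask)
    hits = sum(bagging_predict(models, x) == t for x, t in zip(Xte, yte))
    return hits / len(yte)

def ga_select(train, test, n_features, pop_size=10, generations=8, seed=1):
    """Plain GA over bit masks: elitism, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    def score(mask):
        return fitness(mask, train, test)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        history.append(score(pop[0]))
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]
            children.append([bit ^ (rng.random() < 0.1) for bit in child])
        pop = elite + children
    best = max(pop, key=score)
    return best, score(best), history

X, y = make_data()
train, test = (X[:140], y[:140]), (X[140:], y[140:])
best_mask, best_score, history = ga_select(train, test, n_features=6)
print("selected feature mask:", best_mask)
print("held-out accuracy:", round(best_score, 3))
```

Two caveats apply to this sketch. Scoring feature subsets on the same split that reports the final accuracy gives an optimistic estimate, so a real study would use cross-validation or a separate validation set. Plain accuracy can also look high on imbalanced data simply by favouring the defect-free class, so recall on the defective class is worth reporting alongside it.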
Appendix
See Table 8.
Cite this article
Banga, M., Bansal, A. & Singh, A. Proposed approach to predict software faults detection using Entropy. Int J Syst Assur Eng Manag 11 (Suppl 2), 301–312 (2020). https://doi.org/10.1007/s13198-019-00934-2
DOI: https://doi.org/10.1007/s13198-019-00934-2