Abstract
Background: The influence of data clustering on the effort estimation process has been studied extensively. Existing studies concentrate on partitioning and density-based clustering, and some use hierarchical clustering, but most address software development effort estimation. Aim: We examine the effectiveness of the Gaussian Mixture Model algorithm in software enhancement effort estimation. Method: We used the Gaussian Mixture Model clustering algorithm to partition the dataset into clusters and then applied the IFPUG FPA method for effort estimation on each cluster. The ISBSG dataset was used in this study. The number of clusters was determined using the Elbow method with the distortion score. In addition, the k-means algorithm was used as a comparative algorithm. The baseline model was obtained by applying the FPA method to the entire dataset without clustering. Result: With the number of clusters set to 4, across six evaluation criteria (including MAE, MAPE, RMSE, MBRE, and MIBRE), the experimental results show that the estimation accuracy of the FPA method on clustered data is significantly better than without clustering. Conclusion: Software enhancement effort estimation can be significantly improved by using the Gaussian Mixture Model clustering algorithm.
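The error criteria named in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly used definitions (MAPE as a percentage, MBRE dividing the absolute error by the smaller of the actual and predicted effort, MIBRE dividing by the larger); the abstract does not state the exact formulas used in the chapter, so these may differ in detail. The function name `effort_error_metrics` and the sample values are hypothetical and chosen only for illustration.

```python
import math


def effort_error_metrics(actual, predicted):
    """Return MAE, MAPE, RMSE, MBRE and MIBRE for paired effort values.

    Assumption: all effort values are strictly positive (e.g. person-hours),
    so the relative-error terms are well defined.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(actual)
    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    triples = list(zip(abs_err, actual, predicted))
    return {
        # Mean absolute error, in the unit of effort.
        "MAE": sum(abs_err) / n,
        # Mean absolute percentage error, relative to the actual effort.
        "MAPE": 100.0 * sum(e / a for e, a, _ in triples) / n,
        # Root mean squared error, penalises large misses more heavily.
        "RMSE": math.sqrt(sum(e * e for e in abs_err) / n),
        # Mean balanced relative error: divide by the smaller value.
        "MBRE": sum(e / min(a, p) for e, a, p in triples) / n,
        # Mean inverted balanced relative error: divide by the larger value.
        "MIBRE": sum(e / max(a, p) for e, a, p in triples) / n,
    }


# Made-up effort values, not taken from the ISBSG dataset.
actual = [100.0, 200.0, 400.0]
predicted = [80.0, 250.0, 400.0]
metrics = effort_error_metrics(actual, predicted)
```

To compare clustered against unclustered estimation in this style, the same function could be applied once to the baseline predictions and once per cluster; lower values indicate better accuracy for all five criteria.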
Acknowledgment
This work was supported by the Faculty of Applied Informatics, Tomas Bata University in Zlin, under project IGA/CebiaTech/2022/001 and under project RVO/FAI/2021/002.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Van Hai, V., Le Thi Kim Nhung, H., Prokopová, Z., Silhavy, R., Silhavy, P. (2022). Analyzing the Effectiveness of the Gaussian Mixture Model Clustering Algorithm in Software Enhancement Effort Estimation. In: Nguyen, N.T., Tran, T.K., Tukayev, U., Hong, TP., Trawiński, B., Szczerbicki, E. (eds) Intelligent Information and Database Systems. ACIIDS 2022. Lecture Notes in Computer Science, vol 13758. Springer, Cham. https://doi.org/10.1007/978-3-031-21967-2_21
DOI: https://doi.org/10.1007/978-3-031-21967-2_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21966-5
Online ISBN: 978-3-031-21967-2
eBook Packages: Computer Science, Computer Science (R0)