Analyzing the Effectiveness of the Gaussian Mixture Model Clustering Algorithm in Software Enhancement Effort Estimation

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13758)

Abstract

Background: The influence of data clustering on the effort estimation process has been studied extensively. Existing studies concentrate on partitioning and density-based clustering, and some use hierarchical clustering, but most address software development effort estimation. Aim: We examine the effectiveness of the Gaussian Mixture Model algorithm in software enhancement effort estimation. Method: We used the Gaussian Mixture Model clustering algorithm to partition the dataset into clusters and then applied the IFPUG FPA method for effort estimation to each cluster. The ISBSG dataset was used in this study. The number of clusters was determined using the Elbow method with the Distortion score. The k-means algorithm was used as the comparison algorithm. The baseline model was obtained by applying the FPA method to the entire dataset without clustering. Result: With the number of clusters set to 4, on six evaluation criteria, including MAE, MAPE, RMSE, MBRE, and MIBRE, the experimental results show that the estimation accuracy of the FPA method on clustered data is significantly better than without clustering. Conclusion: Software enhancement effort estimation can be significantly improved by using the Gaussian Mixture Model clustering algorithm.
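
The abstract names its evaluation criteria without defining them. The sketch below shows how the five criteria listed by name (MAE, MAPE, RMSE, MBRE, MIBRE) are commonly computed in the effort estimation literature, and how they could be reported per cluster. It is an illustration under assumptions, not the authors' implementation: the function names `error_metrics` and `metrics_by_cluster`, the input layout, and the sample values are hypothetical, and the formulas are the usual textbook definitions, which may differ in detail from those used in the paper.

```python
import math
from collections import defaultdict
from statistics import fmean


def error_metrics(actual, predicted):
    """Return MAE, MAPE, RMSE, MBRE and MIBRE for paired effort values.

    Assumed (common) definitions, with y = actual and p = predicted:
      MAE   = mean(|y - p|)
      MAPE  = 100 * mean(|y - p| / y)
      RMSE  = sqrt(mean((y - p)^2))
      MBRE  = mean(|y - p| / min(y, p))
      MIBRE = mean(|y - p| / max(y, p))
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(v <= 0 for v in actual) or any(v <= 0 for v in predicted):
        raise ValueError("effort values must be positive")

    pairs = list(zip(actual, predicted))
    abs_err = [abs(y - p) for y, p in pairs]
    return {
        "MAE": fmean(abs_err),
        "MAPE": 100 * fmean([e / y for e, (y, _) in zip(abs_err, pairs)]),
        "RMSE": math.sqrt(fmean([e * e for e in abs_err])),
        "MBRE": fmean([e / min(y, p) for e, (y, p) in zip(abs_err, pairs)]),
        "MIBRE": fmean([e / max(y, p) for e, (y, p) in zip(abs_err, pairs)]),
    }


def metrics_by_cluster(actual, predicted, labels):
    """Group projects by cluster label and compute the metrics per cluster."""
    groups = defaultdict(lambda: ([], []))
    for y, p, label in zip(actual, predicted, labels):
        groups[label][0].append(y)
        groups[label][1].append(p)
    return {label: error_metrics(ys, ps) for label, (ys, ps) in groups.items()}


if __name__ == "__main__":
    # Hypothetical effort values (person-hours) and cluster labels.
    actual = [100, 200, 400]
    predicted = [110, 180, 500]
    labels = [0, 0, 1]

    print("no clustering:", error_metrics(actual, predicted))
    for label, scores in sorted(metrics_by_cluster(actual, predicted, labels).items()):
        print("cluster", label, scores)
```

Lower values are better for all five measures. MBRE and MIBRE divide by the smaller and the larger of the actual and predicted values respectively, so unlike MAPE they do not treat overestimates and underestimates asymmetrically.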

Acknowledgment

This work was supported by the Faculty of Applied Informatics, Tomas Bata University in Zlin, under project IGA/CebiaTech/2022/001 and under project RVO/FAI/2021/002.

Author information

Corresponding author

Correspondence to Vo Van Hai.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Van Hai, V., Le Thi Kim Nhung, H., Prokopová, Z., Silhavy, R., Silhavy, P. (2022). Analyzing the Effectiveness of the Gaussian Mixture Model Clustering Algorithm in Software Enhancement Effort Estimation. In: Nguyen, N.T., Tran, T.K., Tukayev, U., Hong, TP., Trawiński, B., Szczerbicki, E. (eds) Intelligent Information and Database Systems. ACIIDS 2022. Lecture Notes in Computer Science, vol 13758. Springer, Cham. https://doi.org/10.1007/978-3-031-21967-2_21

  • DOI: https://doi.org/10.1007/978-3-031-21967-2_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21966-5

  • Online ISBN: 978-3-031-21967-2

  • eBook Packages: Computer Science, Computer Science (R0)
