Abstract
Bug triage is a crucial activity in the maintenance phase of large-scale software projects, where reported bugs must be assessed and fixed. In this paper we propose an approach to automate one important triage activity, bug severity assignment. The approach mines the historical bug repositories of software projects. It uses the Hierarchical Dirichlet Process (HDP) topic modeller to extract the topics shared by the historical bug reports, then groups the reports according to their proportions in the extracted topics using the K-means clustering algorithm. For each newly submitted report, the top K most similar reports are retrieved from its cluster using a novel weighted K-nearest neighbour algorithm based on a similarity measure called Improved-Sqrt-Cosine similarity. The severity level of the new bug is then assigned using a Dual-weighted voting scheme. The experimental results show that the proposed model improves performance on the bug severity assignment task compared with three baseline models on two popular bug repositories, Eclipse and Mozilla.
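The retrieval and voting steps described in the abstract can be illustrated with a minimal Python sketch. It assumes each report has already been reduced to a non-negative topic-proportion vector and assigned to a cluster, so the HDP and K-means stages are not shown. The similarity follows the commonly cited form of Improved Sqrt-Cosine, the sum of sqrt(x_i * y_i) divided by sqrt(sum x_i) * sqrt(sum y_i). The vote below is a plain similarity-weighted vote used as a stand-in, not the chapter's Dual-weighted scheme, and the function names `isc_similarity` and `predict_severity` are hypothetical.

```python
import math
from collections import defaultdict


def isc_similarity(x, y):
    """Improved sqrt-cosine similarity between two non-negative vectors.

    Assumed form: sum(sqrt(x_i * y_i)) / (sqrt(sum(x)) * sqrt(sum(y))).
    """
    numerator = sum(math.sqrt(a * b) for a, b in zip(x, y))
    denominator = math.sqrt(sum(x)) * math.sqrt(sum(y))
    return numerator / denominator if denominator > 0 else 0.0


def predict_severity(query, reports, k=3):
    """Pick a severity label for `query` from its k most similar reports.

    `reports` is a non-empty list of (topic_vector, severity_label) pairs,
    assumed to come from the cluster the new report falls into.
    Each neighbour votes with its similarity as the weight; this is a
    simplification, not the Dual-weighted voting scheme of the chapter.
    """
    scored = [(isc_similarity(query, vec), label) for vec, label in reports]
    # Keep the k neighbours with the highest similarity to the query.
    top_k = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    votes = defaultdict(float)
    for similarity, label in top_k:
        votes[label] += similarity
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Toy topic-proportion vectors over two topics (illustrative data only).
    cluster = [
        ([0.9, 0.1], "critical"),
        ([0.8, 0.2], "critical"),
        ([0.1, 0.9], "minor"),
    ]
    print(predict_severity([0.85, 0.15], cluster, k=3))
```

Because the vectors are topic proportions, every component is non-negative, which the square roots in the similarity require.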
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Hamdy, A., El-Laithy, A. (2020). Semantic Categorization of Software Bug Repositories for Severity Assignment Automation. In: Jarzabek, S., Poniszewska-Marańda, A., Madeyski, L. (eds) Integrating Research and Practice in Software Engineering. Studies in Computational Intelligence, vol 851. Springer, Cham. https://doi.org/10.1007/978-3-030-26574-8_2
DOI: https://doi.org/10.1007/978-3-030-26574-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26573-1
Online ISBN: 978-3-030-26574-8
eBook Packages: Intelligent Technologies and Robotics (R0)