Abstract
Recent machine learning studies report accurate prediction models for identifying the refactoring operations a program needs. However, such works are limited to prediction: they learn refactoring operations strictly as developers applied them, and therefore miss possibilities that developers may not have considered. On the other hand, the Search-Based Software Refactoring (SBR) field applies search algorithms to find refactoring operations in a vast space of possibilities to improve diverse quality attributes. Nevertheless, unlike machine learning studies, existing SBR approaches do not generate a model, so they must be reapplied individually to each program that needs refactoring. To mitigate this limitation, this work introduces a novel SBR learning approach that generates refactoring algorithms capable of providing refactoring operations for several programs. These algorithms are composed of procedures that use rules to determine the refactoring operations. To create the algorithms, a learning process first extracts refactoring patterns from programs by grouping their elements that were refactored in similar ways. After that, Grammatical Evolution (GE) is applied to generate the algorithms based on a grammar encompassing details of the extracted patterns. GE aims to generate an algorithm that provides refactoring operations similar to those applied in practice while improving quality attributes, such as modularity. The approach is evaluated using refactoring data from 40 Java programs from GitHub repositories. The algorithms are tested against different programs, obtaining an overall average of 60% modularity improvement and 50% similarity with the refactoring operations actually applied.
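To make the grammar-based generation step more concrete, the following is a minimal sketch of the standard GE genotype-to-phenotype mapping, in which a list of integer codons selects productions of a grammar. The grammar, the metric names, the threshold labels, the refactoring operation names, and the function `map_genotype` are all illustrative assumptions; they are not the grammar or the implementation used in this work, whose grammar is derived from the extracted refactoring patterns.

```python
# Toy grammar (hypothetical, for illustration only): each non-terminal
# maps to its list of alternative productions.
GRAMMAR = {
    "<rule>": [["if", "<cond>", "then", "<op>"]],
    "<cond>": [["<metric>", ">", "<threshold>"],
               ["<cond>", "and", "<cond>"]],
    "<metric>": [["coupling"], ["num_methods"], ["lack_of_cohesion"]],
    "<threshold>": [["low"], ["medium"], ["high"]],
    "<op>": [["MoveMethod"], ["ExtractClass"], ["PullUpMethod"]],
}


def map_genotype(codons, start="<rule>", max_wraps=2):
    """Map integer codons to a rule string; return None if mapping fails."""
    symbols = [start]   # symbols still to be expanded, leftmost first
    output = []         # terminals produced so far
    used = 0            # number of codons consumed
    limit = len(codons) * (max_wraps + 1)  # reads allowed, wraps included
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in GRAMMAR:          # terminal: emit as-is
            output.append(symbol)
            continue
        options = GRAMMAR[symbol]
        if len(options) == 1:              # no choice, no codon consumed
            choice = options[0]
        else:
            if used >= limit:              # ran out of codons: invalid
                return None
            codon = codons[used % len(codons)]   # wrap around the genome
            choice = options[codon % len(options)]
            used += 1
        symbols = list(choice) + symbols   # expand the leftmost non-terminal
    return " ".join(output)


print(map_genotype([0, 1, 2, 0]))
```

In a full GE run, a search algorithm would evolve the codon lists and a fitness function would score each decoded algorithm, for example on quality improvement and on similarity with the refactoring operations applied in practice, as described above; that part is omitted from this sketch.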
Acknowledgements
The authors would like to thank CAPES for supporting Thainá Mariani through the PDSE program, associated with process 88881.135198/2016-01. Silvia R. Vergilio is supported by CNPq (Grant: 305968/2018).
Ethics declarations
Conflict of Interests
We declare that we have no conflict of interests.
Additional information
Communicated by: Aldeida Aleti, Annibale Panichella and Shin Yoo
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Advances in Search-Based Software Engineering (SSBSE)
About this article
Cite this article
Mariani, T., Kessentini, M. & Vergilio, S.R. Generation of refactoring algorithms by grammatical evolution. Empir Software Eng 27, 110 (2022). https://doi.org/10.1007/s10664-022-10151-4
DOI: https://doi.org/10.1007/s10664-022-10151-4