Abstract
Automated software transformations and the process of automated program repair and improvement are an important aspect of modern software engineering practices. In this paper, we describe a system which uses a graph-based deep learning model that can be trained to automatically transform and improve computer programs. By operating on language-agnostic, universal graph-like structures easily extractable from source code files (abstract syntax trees), the deep learning agent learns which transformations should be effectively applied to various structures recognized in the source code in order to improve it. By defining a metric which we want to improve and introducing an optimization task—a reinforcement learning setting, an agent learns to automatically apply a chain of transformations to the program, drastically improving it. While similar program improvement processes exist, they exclusively use exhaustive search algorithms to try all the possible code transformations which is a long process susceptible to local optimum issues. Our solution aims to model and embed structural knowledge about the programs being transformed which greatly helps the agent to choose best possible code transformations to apply. Elements of the approach we present in this paper are further applicable not just to automatic software improvement tasks, but also to other code-related tasks.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Data availability
All software presented in this paper is Free and Open Source software under the GNU Public License. Our source code, datasets, results and instructions for reproducing the results are available in the following GitHub repository: https://github.com/nsukur/rl. We also provide a training set of intermediate graphs for training of the embedding model, which is linked in the aforementioned GitHub repository.
References
Allamanis M, Brockschmidt M, Khademi M (2017) Learning to represent programs with graphs. arXiv preprint arXiv:1711.00740
Alon U, Zilberstein M, Levy O, Yahav E (2019) code2vec: Learning distributed representations of code. In: Proceedings of the ACM on programming languages 3(POPL), 1–29
Arcuri A, Yao X (2008) A novel co-evolutionary approach to automatic software bug fixing. In: Evolutionary computation, 2008. CEC 2008.(IEEE world congress on computational intelligence). IEEE Congress On, pp 162–168. IEEE
Balog M, Gaunt AL, Brockschmidt M, Nowozin S, Tarlow D (2016) Deepcoder: learning to write programs. arXiv preprint arXiv:1611.01989
Ben-Nun T, Jakobovits AS, Hoefler T (2018) Neural code comprehension: a learnable representation of code semantics. Adv Neural Inf Process Syst 31
Brockman G, Cheung V, Pettersson L, Schneider J, Schulman J, Tang J, Zaremba W (2016) Openai gym. arXiv preprint arXiv:1606.01540
Bunel R, Hausknecht M, Devlin J, Singh R, Kohli P (2018) Leveraging grammar and reinforcement learning for neural program synthesis. arXiv preprint arXiv:1805.04276
Campbell GA, SonarSource S (2020) Cognitive complexity. SonarSource, Geneva
Cummins C, Fisches ZV, Ben-Nun T, Hoefler T, Leather H (2020) Programl: Graph-based deep learning for program optimization and analysis. arXiv preprint arXiv:2003.10536
Dasgupta SS, Ray SN, Talukdar P (2018) Hyte: Hyperplane-based temporally aware knowledge graph embedding. In: Proceedings of the 2018 conference on empirical methods in natural language processing, pp 2001–2011
Ebert C, Cain J, Antoniol G, Counsell S, Laplante P (2016) Cyclomatic complexity. IEEE softw 33(6):27–29
Fey M, Lenssen JE (2019) Fast graph representation learning with pytorch geometric. arXiv preprint arXiv:1903.02428
Forrest S, Nguyen T, Weimer W, Le Goues C (2009) A genetic programming approach to automated software repair. In: Proceedings of the 11th annual conference on genetic and evolutionary computation, pp 947–954. ACM
Fu C, Chen H, Liu H, Chen X, Tian Y, Koushanfar F, Zhao J (2019) Coda: an end-to-end neural program decompiler. Adv Neural Inf Process Syst 32
Grover A, Leskovec J (2016) node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 855–864
Gupta R, Pal S, Kanade A, Shevade S (2017) Deepfix: fixing common c language errors by deep learning. In: Thirty-first AAAI conference on artificial intelligence
Hagberg A, Swart P, S Chult D (2008) Exploring network structure, dynamics, and function using networkx. Technical report, Los Alamos National Lab.(LANL), Los Alamos, NM (United States)
Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. Adv Neural Inf Process Syst 30
Harrand N, Soto-Valero C, Monperrus M, Baudry B (2020) Java decompiler diversity and its application to meta-decompilation. J Syst Softw 168:110645
Heo K, Lee W, Pashakhanloo P, Naik M (2018) Effective program debloating via reinforcement learning. In: Proceedings of the 2018 ACM SIGSAC conference on computer and communications security, pp 380–394
Huang S, Ontañón S (2020) A closer look at invalid action masking in policy gradient algorithms. arXiv preprint arXiv:2006.14171
Katz DS, Ruchti J, Schulte E (2018) Using recurrent neural networks for decompilation. In: 2018 IEEE 25th international conference on software analysis, evolution and reengineering (SANER), pp 346–356. IEEE
Kim D, Nam J, Song J, Kim S (2013) Automatic patch generation learned from human-written patches. In: Proceedings of the 2013 international conference on software engineering. ICSE ’13, pp 802–811. IEEE Press, San Francisco, CA, USA
Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980
Le Goues C, Nguyen T, Forrest S, Weimer W (2012) Genprog: a generic method for automatic software repair. IEEE Trans Software Eng 38(1):54
Liang C, Berant J, Le Q, Forbus KD, Lao N (2016) Neural symbolic machines: learning semantic parsers on freebase with weak supervision. arXiv preprint arXiv:1611.00020
Li Z, Wu Q, Qian K (2019) Adabot: Fault-tolerant java decompiler. arXiv preprint arXiv:1908.06748
Małkowski A, Grzechociński J, Wawrzyński P (2022) Graph autoencoder with constant dimensional latent space. arXiv preprint arXiv:2201.12165
Nie M, Chen D, Wang D (2023) Reinforcement learning on graphs: a survey. IEEE Trans Emerg Top Comput Intell. https://doi.org/10.1109/TETCI.2022.3222545
Parisotto E, Mohamed A-r, Singh R, Li L, Zhou D, Kohli P (2016) Neuro-symbolic program synthesis. arXiv preprint arXiv:1611.01855
Pawlak R, Monperrus M, Petitprez N, Noguera C, Seinturier L (2015) Spoon: a library for implementing analyses and transformations of java source code. Softw Pract Exp 46:1155–1179. https://doi.org/10.1002/spe.2346
Perozzi B, Al-Rfou R, Skiena S (2014) Deepwalk: Online learning of social representations. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 701–710
Pracner D, Budimac Z (2017) Enabling code transformations with fermat on simplified bytecode. J Softw Evol Process 29(5):1857. https://doi.org/10.1002/smr.1857
Raffin A, Hill A, Ernestus M, Gleave A, Kanervisto A, Dormann N (2019) Stable baselines 3
Russell SJ, Norvig P (2016) Artificial intelligence: a modern approach. Pearson Higher Ed, London
Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347
Shin ECR, Song D, Moazzezi R (2015) Recognizing functions in binaries with neural networks. In: 24th USENIX security symposium (USENIX Security 15), pp 611–626
Sukur N, Pracner D, Budimac Z (2019) Fitness functions and transformations in an automated process. In: Budimac Z, Koteska B (eds) SQAMIA 2019, 8th workshop of software quality, analysis, monitoring, improvement, and applications, pp 17–011710. CEUR-WS.org, Ohrid, North Macedonia. http://ceur-ws.org/Vol-2217/
Tang C-Y, Liu C-H, Chen W-K, You SD (2020) Implementing action mask in proximal policy optimization (ppo) algorithm. ICT Express 6(3):200–203
Wang X, Lyu D, Li M, Xia Y, Yang Q, Wang X, Wang X, Cui P, Yang Y, Sun B et al (2021) Apan: Asynchronous propagation attention network for real-time temporal graph embedding. In: Proceedings of the 2021 international conference on management of data, pp 2628–2638
Wang H, Wang S, Xu D, Zhang X, Liu X (2020) Generating effective software obfuscation sequences with reinforcement learning. IEEE Trans Depend Secure Comput
Ward M (2013) Assembler restructuring in fermat. In: SCAM, pp 147–156. IEEE, Eindhoven, Netherlands. https://doi.org/10.1109/SCAM.2013.6648196
Watson AH, Wallace DR, McCabe TJ (1996) Structured testing: A testing methodology using the cyclomatic complexity metric. Special Publication (NIST SP), National Institute of Standards and Technology, Gaithersburg, MD, Gaithersburg, MD, USA
Yuan Y, Banzhaf W (2018) Arja: automated repair of java programs via multi-objective genetic programming. IEEE Trans Software Eng 46(10):1040–1067. https://doi.org/10.1109/TSE.2018.2874648
Yuan Y, Banzhaf W (2019) Toward better evolutionary program repair: an integrated approach. ACM Trans Softw Eng Methodol. https://doi.org/10.1145/3360004
Zhang L, Rosenblatt G, Fetaya E, Liao R, Byrd W, Might M, Urtasun R, Zemel R (2018) Neural guided constraint logic programming for program synthesis. Adv Neural Inf Process Syst 31
Acknowledgements
The authors gratefully acknowledge the financial support of the Ministry of Science, Technological Development and Innovation of the Republic of Serbia (Grant No. 451-03-47/2023-01/200125).
Funding
This study was funded with the financial support of the Ministry of Science, Technological Development and Innovation of the Republic of Serbia (Grant No. 451-03-47/2023-01/200125)
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
All authors of this manuscript declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
This article does not contain any studies with human participants performed by any of the authors, therefore informed consent was not required.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sukur, N., Milošević, N., Pracner, D. et al. Automated program improvement with reinforcement learning and graph neural networks. Soft Comput 28, 2593–2604 (2024). https://doi.org/10.1007/s00500-023-08559-1
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00500-023-08559-1