Abstract
Rust is a relatively new programming language that allows programmers to write programs with low-level control over resources while still ensuring high-level safety guarantees (for programs written in safe Rust). Rust’s ownership framework enables programs to meet these two seemingly contradictory goals. The borrow checker, a component of the Rust compiler, enforces the ownership rules that underpin Rust’s safety guarantees. Rust is popular: as of 2022, it has ranked first, for the seventh consecutive year, in Stack Overflow’s annual Developer Survey as the most-loved programming language. The number of Rust developers is growing as the need for faster and safer software increases. Yet, to our knowledge, no research has sought to identify the most pervasive bug fix patterns within Rust programs. In this project, we introduce Ruxanne, a tool for analyzing and extracting fix patterns in Rust. Ruxanne implements a novel embedding of Rust code into fixed-sized vectors. Using Ruxanne, we mined the 18 most-starred Rust projects on GitHub to discover the most common bug fix patterns committed to their repositories. We analyzed 87,726 code changes drawn from 57,214 commits across these 18 projects. After clustering the code changes and conducting a manual analysis, we identified 20 groups of cross-project bug fix patterns, which we categorize as (1) general patterns and (2) borrow-checker-related patterns. Among the general patterns, the most frequently observed is the addition or removal of struct fields. Among the borrow-checker-related patterns, the most common is the removal of a clone() call. We describe all detected patterns and their implications for automated program repair.
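To make the most common borrow-checker-related pattern concrete, the following is a minimal sketch of what removing a clone() call can look like. It is an illustration only and is not drawn from the mined projects; the function names (total_len_before, total_len_after, consume) are hypothetical.

```rust
// Hypothetical "before" version: clone() copies the whole vector only so
// that the callee can take ownership of it.
fn total_len_before(names: &Vec<String>) -> usize {
    let owned: Vec<String> = names.clone(); // extra allocation and copy
    consume(owned)
}

// Takes ownership of the vector even though it only reads from it.
fn consume(names: Vec<String>) -> usize {
    names.iter().map(|n| n.len()).sum()
}

// Hypothetical "after" version: borrowing a slice is enough for read-only
// access, so the clone() call can be removed.
fn total_len_after(names: &[String]) -> usize {
    names.iter().map(|n| n.len()).sum()
}

fn main() {
    let names = vec![String::from("rust"), String::from("borrow")];
    // Both versions compute the same result; only the copying differs.
    assert_eq!(total_len_before(&names), total_len_after(&names));
    println!("{}", total_len_after(&names));
}
```

In this sketch the fix changes the callee to borrow rather than own its argument, which is one plausible way a clone() becomes unnecessary; other fixes of the same shape may instead move the value or restructure lifetimes.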
Data Availability Statement
The datasets generated and analyzed during the current study are available in the Zenodo repository, https://zenodo.org/record/8052979.
Notes
At the extreme, a fault could be caused by moths or other hardware malfunctions.
Ethics declarations
Competing Interests
We have no competing interests and are funded by a Discovery Grant from Canada’s Natural Sciences and Engineering Research Council.
Additional information
Communicated by: Martin Monperrus.
About this article
Cite this article
Robati Shirzad, M., Lam, P. A study of common bug fix patterns in Rust. Empir Software Eng 29, 44 (2024). https://doi.org/10.1007/s10664-023-10437-1