A study of common bug fix patterns in Rust | Empirical Software Engineering Skip to main content
Log in

A study of common bug fix patterns in Rust

  • Published:
Empirical Software Engineering Aims and scope Submit manuscript

Abstract

Rust is a relatively new programming language which allows programmers to write programs that have low-level control over resources while still ensuring high-level safety guarantees (for programs written in safe Rust). Rust’s ownership framework enables programs to meet these two seemingly-contradictory goals. The Rust compiler’s Borrow-Checker component enforces the ownership framework requirements that ensure Rust’s safety guarantees. Rust is popular: as of 2022, it has ranked first, for the seventh consecutive year, in Stack Overflow’s annual Developer Survey as the most-loved programming language. The number of Rust developers is growing as the need for faster and safer software increases. Yet, to our knowledge, no research has sought to identify the most pervasive bug fix patterns within Rust programs. In this project, we introduce Ruxanne, a tool for analyzing and extracting fix patterns in Rust. Ruxanne implements a novel embedding of Rust code into fixed-sized vectors. Using Ruxanne, we mined the top 18 most-starred Rust projects in GitHub to discover the most common bug fix patterns committed to their repositories. We analyzed 87,726 code changes drawn from 57,214 commits across these 18 projects. After clustering the code changes, and conducting a manual analysis, we identified 20 groups of cross-project bug fix patterns, which we categorize as (1) general patterns and (2) borrow-checker-related patterns. Among the general patterns, the most frequently observed pattern is when the user either adds or removes struct fields. In the case of borrow-checker-related patterns, the most common pattern we encountered is when the user removes a clone() call. We describe all detected patterns and their implications to automated program repair.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Fig. 1
Fig. 2
Algorithm 1
Fig. 3
Fig. 4

Similar content being viewed by others

Data Availability Statement

The datasets generated and analyzed during the current study are available in the Zenodo repository, https://zenodo.org/record/8052979.

Notes

  1. At the extreme, a fault could be caused by moths or other hardware malfunctions.

  2. https://crates.io/crates/syn

  3. https://zenodo.org/record/7388618

  4. https://www.dabeaz.com/ply/

  5. https://dictdiffer.readthedocs.io/en/latest/

  6. https://doc.rust-lang.org/reference/items.html

  7. https://github.com/mhtchan/packcircles

  8. https://github.com/swc-project/swc/commit/75a14f98b7370226115ee24eec6eb8c802bd4837

  9. https://github.com/starship/starship/commit/4de9e43cff46c834bd340d24a02fc95d85310a33

  10. https://github.com/rust-lang/rust-clippy

  11. https://github.com/tauri-apps/tauri/commit/a280ee90af0749ce18d6d0b00939b06473717bc9

  12. https://github.com/starship/starship/commit/56d475578ea508631275772127f49a6949fea6b0

  13. https://rust-lang.github.io/rust-clippy/master/index.html#inefficient_to_string

References

  • Alon U, Zilberstein M, Levy O, Yahav E (2018) A general path-based representation for predicting program properties. ACM SIGPLAN Not 53(4):404–419

    Article  Google Scholar 

  • Alon U, Zilberstein M, Levy O, Yahav E (2019a) code2seq: Generating sequences from structured representations of code. In: Proceedings of the 2019 Conference of the Association for Computational Linguistics (ACL). pp 6304–6315

  • Alon U, Zilberstein M, Levy O, Yahav E (2019b) code2vec: Learning distributed representations of code. Proc ACM Program Lang 3(POPL):1–29

  • Arcuri A, Briand L (2011) A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: Proceedings of the 33rd international conference on software engineering. pp 1–10

  • Bielik P, Raychev V, Vechev M (2016) PHOG: Probabilistic model for code. In: International conference on machine learning. pp 2933–2942

  • Campos EC, Maia MA (2019) Discovering common bug-fix patterns: A large-scale observational study. J Softw: Evol Process 31(7):1–28

    Google Scholar 

  • Cannon L, Elliott R, Kirchhoff L, Miller J, Milner J, Mitze R, Schan E, Whittington N, Spencer H, Keppel D et al (1991) Recommended C style and coding standards. Pocket reference guide, Specialized Systems Consultants

    Google Scholar 

  • Chen Z, Monperrus M (2019) A literature study of embeddings on source code. arXiv:1904.03061

  • Collins CR, Stephenson K (2003) A circle packing algorithm. Comput Geom 25(3):233–256

    Google Scholar 

  • Cotroneo D, De Simone L, Iannillo A K, Natella R, Rosiello S, Bidokhti N (2019) Analyzing the context of bug-fixing changes in the OpenStack cloud computing platform. In: 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE). IEEE, pp 334–345

  • DeGroot M H, Schervish M J (2012) Probability and statistics. Pearson Education

  • Endres A (1975) An analysis of errors and their causes in system programs. IEEE Trans Softw Eng 1(1):140–149

    Article  Google Scholar 

  • Ester M, Kriegel H-P, Sander J, Xu X et al (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD. pp 226–231

  • Eyolfson J (2018) Enforcing Abstract Immutability. PhD thesis, University of Waterloo

  • Flanagan C, Felleisen M (1998) A new way of debugging Lisp programs. In: Proceedings of Lisp Users’ Group Meeting (LUGM)

  • Forrest S, Nguyen T, Weimer W, Le Goues C (2009) A genetic programming approach to automated software repair. In: Proceedings of the 11th annual conference on genetic and evolutionary computation. pp 947–954

  • Gopinath R, Jensen C, Groce A et al (2015) Mutant census: An empirical examination of the competent programmer hypothesis. Technical Report, School of EECS, Oregon State University

  • Hanam Q, Brito FSd M, Mesbah A (2016) Discovering bug patterns in JavaScript. In: Proceedings of the 2016 24th ACM SIGSOFT international symposium on foundations of software engineering. pp 144–156

  • Hindle A, Barr ET, Gabel M, Su Z, Devanbu P (2016) On the naturalness of software. Commun ACM 59(5):122–131

    Article  Google Scholar 

  • Hoang T, Kang H J, Lo D, Lawall J (2020) CC2Vec: Distributed representations of code changes. In: Proceedings of the ACM/IEEE 42nd international conference on software engineering. pp 518–529

  • Huang W, Milanova A, Dietl W, Ernst MD (2012) ReIm & ReImInfer: Checking and inference of reference immutability and method purity. OOPSLA 2012, Object-Oriented Programming Systems, Languages, and Applications. Tucson, AZ, USA, pp 879–896

    Google Scholar 

  • Islam MR, Zibran MF (2021) What changes in where? An empirical study of bug-fixing change patterns. ACM SIGAPP Appl Comput Rev 20(4):18–34

    Article  Google Scholar 

  • Jeffrey D, Feng M, Gupta N, Gupta R (2009) Bugfix: A learning-based tool to assist developers in fixing bugs. In: 2009 IEEE 17th international conference on program comprehension. IEEE, pp 70–79

  • Jones J A, Harrold M J (2005) Empirical evaluation of the Tarantula automatic fault-localization technique. In: Proceedings of the 20th IEEE/ACM international conference on automated software engineering. pp 273–282

  • Klabnik S, Nichols C (2019) The Rust programming language (Covers Rust 2018). No Starch Press

  • Knuth DE (1989) The errors of TeX. Softw-Pract Exper 19(7):607–685

    Article  Google Scholar 

  • Le Goues C, Dewey-Vogt M, Forrest S, Weimer W (2012) A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each. In: 2012 34th International Conference on Software Engineering (ICSE). IEEE, pp 3–13

  • Le Goues C, Pradel M, Roychoudhury A (2019) Automated program repair. Commun ACM 62(12):56–65

    Article  Google Scholar 

  • Li Z, Wang J, Sun M, Lui J C (2021) MirChecker: Detecting bugs in rust programs via static analysis. In: Proceedings of the 2021 ACM SIGSAC conference on computer and communications security. pp 2183–2196

  • Lin B, Wang S, Wen M, Mao X (2022) Context-aware code change embedding for better patch correctness assessment. ACM Trans Softw Eng Methodol (TOSEM) 31(3):1–29

    Google Scholar 

  • Ling M, Yu Y, Wu H, Wang Y, Cordy J R, Hassan A E (2022) In Rust we trust: a transpiler from unsafe C to safer Rust. In: Proceedings of the ACM/IEEE 44th international conference on software engineering: companion proceedings. pp 354–355

  • Liu Y, Zhang L, Zhang Z (2018) A survey of test based automatic program repair. J. Softw. 13(8):437–452

    Article  Google Scholar 

  • Madeiral F, Durieux T, Sobreira V, Maia M (2018) Towards an automated approach for bug fix pattern detection. arXiv:1807.11286

  • Martinez M, Monperrus M (2012) Mining repair actions for guiding automated program fixing. PhD thesis, Inria

  • Martinez M, Monperrus M (2015) Mining software repair models for reasoning on the search space of automated program fixing. Emp Softw Eng 20(1):176–205

    Article  Google Scholar 

  • Monperrus M (2014) “A critical review of automatic patch generation learned from human-written patches”: Essay on the problem statement and the evaluation of automatic software repair. In: Proceedings of the 36th international conference on software engineering. pp 234–242

  • Moss S (2021) How Dropbox pulled off its hybrid cloud transition. https://www.datacenterdynamics.com/en/analysis/how-dropbox-pulled-off-its-hybrid-cloud-transition/. November 21, 2022

  • Naish L, Lee H J, Ramamohanarao K (2009) Spectral debugging with weights and incremental ranking. In: 2009 16th Asia-pacific software engineering conference. IEEE, pp 168–175

  • Nguyen T, Weimer W, Le Goues C, Forrest S (2009) Using execution paths to evolve software patches. In: 2009 International conference on software testing, verification, and validation workshops. IEEE, pp 152–153

  • Pan K, Kim S, Whitehead EJ (2009) Toward an understanding of bug fix patterns. Emp Softw Eng 14(3):286–315

    Article  Google Scholar 

  • Qi Y, Mao X, Lei Y (2013) Efficient automated program repair through fault-recorded testing prioritization. In 2013 IEEE International Conference on Software Maintenance. IEEE, pp 180–189

  • Qi Y, Mao X, Lei Y, Dai Z, Wang C (2014) The strength of random search on automated program repair. In Proceedings of the 36th International Conference on Software Engineering. pp 254–265

  • Qin B, Chen Y, Yu Z, Song L, Zhang Y (2020) Understanding memory and thread safety practices and issues in real-world Rust programs. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation. pp 763–779

  • Raychev V, Bielik P, Vechev M, Krause A (2016) Learning programs from noisy data. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’16. page 761-774, Association for Computing Machinery, New York, NY, USA

  • Sam G, Cameron N, Potanin A (2017) Automated refactoring of Rust programs. In Proceedings of the Australasian Computer Science Week Multiconference. pp 1–9

  • Spadini D, Aniche M, Bacchelli A (2018) Pydriller: Python framework for mining software repositories. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. pp 908–911

  • Tan S H, Roychoudhury A (2015) relifix: Automated repair of software regressions. In 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering volume 1. IEEE, pp 471–482

  • Tian H, Tang X, Habib A, Wang S, Liu K, Xia X, Klein J, Bissyandé T F (2022) Is this change the answer to that problem? Correlating descriptions of bug and code changes for evaluating patch correctness. In Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering. pp 1–13

  • Wong WE, Gao R, Li Y, Abreu R, Wotawa F (2016) A survey on software fault localization. IEEE Trans Softw Eng 42(8):707–740

    Article  Google Scholar 

  • Xie X, Chen TY, Kuo F-C, Xu B (2013) A theoretical analysis of the risk evaluation formulas for spectrum-based fault localization. ACM Trans Softw Eng Methodol (TOSEM) 22(4):1–40

    Article  Google Scholar 

  • Xu R, Wunsch D (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678

    Article  Google Scholar 

  • Yang Y, He T, Feng Y, Liu S, Xu B (2022) Mining Python fix patterns via analyzing fine-grained source code changes. Emp Softw Eng 27(2):1–37

    Google Scholar 

  • Ye H, Gu J, Martinez M, Durieux T, Monperrus M (2021) Automated classification of overfitting patches with statically extracted code features. IEEE Trans Softw Eng 48(8):2920–2938

    Article  Google Scholar 

  • Zhang Y, Chen Y, Cheung S-C, Xiong Y, Zhang L (2018) An empirical study on TensorFlow program bugs. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. pp 129–140

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Mohammad Robati Shirzad.

Ethics declarations

Competing Interests

We have no competing interests and are funded by a Discovery Grant from Canada’s Natural Science and Engineering Research Council

Additional information

Communicated by: Martin Monperrus.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Robati Shirzad, M., Lam, P. A study of common bug fix patterns in Rust. Empir Software Eng 29, 44 (2024). https://doi.org/10.1007/s10664-023-10437-1

Download citation

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s10664-023-10437-1

Keywords

Navigation