Abstract
Recognising the difference between novice and expert programmers could be beneficial in a wide range of scenarios, such as screening applicants for programming jobs. In this paper, we explore the identification of code author attributes to enable novice/expert differentiation via machine learning models. Our iteratively developed model is based on data from HackerRank, a competitive programming website. Multiple experiments were carried out using 10-fold cross-validation. Our final model differentiated novice coders from expert coders with 71.3% accuracy.
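The evaluation protocol named in the abstract, 10-fold cross-validation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the single numeric feature and the midpoint-threshold classifier are hypothetical stand-ins, and the helper names (`k_fold_indices`, `fit_threshold`, `cross_validate`) are invented here. It does not reproduce the authors' features, model, or HackerRank data.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_threshold(xs, ys):
    """Toy classifier (hypothetical): midpoint between the two class means.

    Assumes label 0 = novice, label 1 = expert, and that experts tend to
    have the larger feature value.
    """
    novice = [x for x, y in zip(xs, ys) if y == 0]
    expert = [x for x, y in zip(xs, ys) if y == 1]
    return (sum(novice) / len(novice) + sum(expert) / len(expert)) / 2


def cross_validate(xs, ys, k=10, seed=0):
    """Return the mean held-out accuracy over k folds."""
    folds = k_fold_indices(len(xs), k, seed)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        # Train on every sample that is not in the held-out fold.
        train = [i for i in range(len(xs)) if i not in held]
        threshold = fit_threshold([xs[i] for i in train],
                                  [ys[i] for i in train])
        # Predict "expert" when the feature exceeds the learned threshold.
        correct = sum((xs[i] > threshold) == bool(ys[i]) for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


# Synthetic, made-up data: one numeric feature per code submission.
rng = random.Random(42)
labels = [0] * 100 + [1] * 100
features = [rng.gauss(10 + 5 * y, 4) for y in labels]

mean_accuracy = cross_validate(features, labels)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")
```

Each sample is held out exactly once, so the reported figure is an average over ten models, each scored only on data it was not trained on.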
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Lee, C.H., Hall, T. (2021). Using Machine Learning to Recognise Novice and Expert Programmers. In: Ardito, L., Jedlitschka, A., Morisio, M., Torchiano, M. (eds) Product-Focused Software Process Improvement. PROFES 2021. Lecture Notes in Computer Science, vol 13126. Springer, Cham. https://doi.org/10.1007/978-3-030-91452-3_13
DOI: https://doi.org/10.1007/978-3-030-91452-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91451-6
Online ISBN: 978-3-030-91452-3
eBook Packages: Computer Science, Computer Science (R0)