Abstract
Popular, large contemporary open-source projects now embrace a diverse set of documentation and communication channels. Examples include contribution guidelines (i.e., commit message guidelines, coding rules, submission guidelines), codes of conduct (i.e., rules and behavior expectations), governance policies, and Q&A forums. In 2020, GitHub released Discussions to distinguish between communication and collaboration. However, it remains unclear how developers maintain these channels, how trivial conversions between them are, and whether deciding on a conversion takes time. We conducted an empirical study of 259 NPM and 148 PyPI repositories, devising two taxonomies of reasons for converting discussions into issues and vice versa. The most frequent reason for converting a discussion into an issue is that developers ask a contributor to clarify their idea in an issue (Reporting a Clarification Request: 35.1% and 34.7% for NPM and PyPI, respectively), while agreeing that a topic is non-actionable (Q&A, ideas, feature requests: 55.0% and 42.0%, respectively) is the most frequent reason for converting an issue into a discussion. Furthermore, we show that not all reasons for conversion are trivial (e.g., not a bug), and that raising a conversion intent potentially takes time (i.e., a median of 15.2 and 35.1 hours, respectively, for conversions from issues to discussions). Our work complements the GitHub guidelines and helps developers effectively use the Issue and Discussion communication channels to maintain their collaboration.
Data availability
The datasets generated and/or analysed during the current study are available at https://github.com/posl/GitHub_Discussion_Conversion
Notes
https://github.blog/2020-05-06-new-from-satellite-2020-github-codespaces-github-discussions-securing-code-in-private-repositories-and-more/
Acknowledgements
This work is supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI grants (JP20K19774, JP20H05706, JP22K17874, JP21H04877, JP23K16864), and by JSPS and SNSF for the project “SENSOR” (JPJSJRP20191502).
Ethics declarations
Conflict of interest
The authors declare that Raula Gaikovina Kula and Yasutaka Kamei are members of the EMSE Editorial Board. All co-authors have seen and agreed with the contents of the manuscript and there is no financial interest to report.
Additional information
Communicated by: Jeffrey C. Carver
About this article
Cite this article
Wang, D., Kondo, M., Kamei, Y. et al. When conversations turn into work: a taxonomy of converted discussions and issues in GitHub. Empir Software Eng 28, 138 (2023). https://doi.org/10.1007/s10664-023-10366-z
DOI: https://doi.org/10.1007/s10664-023-10366-z