{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,25]],"date-time":"2024-09-25T11:27:46Z","timestamp":1727263666083},"reference-count":95,"publisher":"Springer Science and Business Media LLC","license":[{"start":{"date-parts":[[2022,1,18]],"date-time":"2022-01-18T00:00:00Z","timestamp":1642464000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,1,18]],"date-time":"2022-01-18T00:00:00Z","timestamp":1642464000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100009077","name":"itea3","doi-asserted-by":"publisher","award":["17039"],"id":[{"id":"10.13039\/501100009077","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Requirements Eng"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Recommender systems for requirements are typically built on the assumption that similar requirements can be used as proxies to retrieve similar software. When a stakeholder proposes a new requirement, natural language processing (NLP)-based similarity metrics can be exploited to retrieve existing requirements, and in turn, identify previously developed code. Several NLP approaches for similarity computation between requirements are available. However, there is little empirical evidence on their effectiveness for code retrieval. This study compares different NLP approaches, from lexical ones to semantic, deep-learning techniques, and correlates the similarity among requirements with the similarity of their associated software. The evaluation is conducted on real-world requirements from two industrial projects from a railway company. Specifically, the most similar pairs of requirements across two industrial projects are automatically identified using six language models. 
Then, the trace links between requirements and software are used to identify the software pairs associated with each requirements pair. The software similarity between pairs is then automatically computed with JPLag. Finally, the correlation between requirements similarity and software similarity is evaluated to see which language model shows the highest correlation and is thus more appropriate for code retrieval. In addition, we perform a focus group with members of the company to collect qualitative data. Results show a moderately positive correlation between requirements similarity and software similarity, with the pre-trained deep learning-based BERT language model with preprocessing outperforming the other models. Practitioners confirm that requirements similarity is generally regarded as a proxy for software similarity. However, they also highlight that additional aspects come into play when deciding on software reuse, e.g., domain\/project knowledge, information coming from test cases, and trace links. Our work is among the first ones to explore the relationship between requirements and software similarity from a quantitative and qualitative standpoint. 
This can be useful not only in recommender systems but also in other requirements engineering tasks in which similarity computation is relevant, such as tracing and change impact analysis.<\/jats:p>","DOI":"10.1007\/s00766-021-00370-4","type":"journal-article","created":{"date-parts":[[2022,1,18]],"date-time":"2022-01-18T02:07:31Z","timestamp":1642471651000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["On the relationship between similar requirements and similar software"],"prefix":"10.1007","author":[{"ORCID":"http:\/\/orcid.org\/0000-0001-6418-9971","authenticated-orcid":false,"given":"Muhammad","family":"Abbas","sequence":"first","affiliation":[]},{"given":"Alessio","family":"Ferrari","sequence":"additional","affiliation":[]},{"given":"Anas","family":"Shatnawi","sequence":"additional","affiliation":[]},{"given":"Eduard","family":"Enoiu","sequence":"additional","affiliation":[]},{"given":"Mehrdad","family":"Saadatmand","sequence":"additional","affiliation":[]},{"given":"Daniel","family":"Sundmark","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,1,18]]},"reference":[{"key":"370_CR1","doi-asserted-by":"crossref","unstructured":"Abbas M, Ferrari A, Shatnawi A, Enoiu EP, Saadatmand M (2021) Is requirements similarity a good proxy for software similarity? an empirical investigation in industry. In: The 27th international working conference on requirements engineering: foundation for Software Quality, pp. 3\u201318. Springer International Publishing","DOI":"10.1007\/978-3-030-73128-1_1"},{"key":"370_CR2","doi-asserted-by":"crossref","unstructured":"Abbas M, Jongeling R, Lindskog C, Enoiu EP, Saadatmand M, Sundmark D (2020) Product line adoption in industry: An experience report from the railway domain. In: Proceedings of the 24th ACM Conference on Systems and Software Product Line: Volume A - Volume A, SPLC \u201920. 
ACM, New York, NY, USA","DOI":"10.1145\/3382025.3414953"},{"key":"370_CR3","doi-asserted-by":"crossref","unstructured":"Abbas M, Saadatmand M, Enoiu E, Sundamark D, Lindskog C (2020) Automated reuse recommendation of product line assets based on natural language requirements. In: S.\u00a0Ben\u00a0Sassi, S.\u00a0Ducasse, H.\u00a0Mili (eds.) Reuse in Emerging Software Engineering Practices, pp. 173\u2013189. Springer International Publishing, Cham","DOI":"10.1007\/978-3-030-64694-3_11"},{"issue":"6","key":"370_CR4","doi-asserted-by":"publisher","first-page":"5454","DOI":"10.1007\/s10664-020-09864-1","volume":"25","author":"S Abualhaija","year":"2020","unstructured":"Abualhaija S, Arora C, Sabetzadeh M, Briand LC, Traynor M (2020) Automated demarcation of requirements in textual specifications: a machine learning-based approach. Emp Softw Eng 25(6):5454\u20135497","journal-title":"Emp Softw Eng"},{"issue":"5","key":"370_CR5","doi-asserted-by":"publisher","first-page":"725","DOI":"10.1109\/TSE.2012.71","volume":"39","author":"N Ali","year":"2012","unstructured":"Ali N, Gu\u00e9h\u00e9neuc YG, Antoniol G (2012) Trustrace: mining software repositories to improve the accuracy of requirement traceability links. IEEE Trans Softw Eng 39(5):725\u2013741","journal-title":"IEEE Trans Softw Eng"},{"issue":"10","key":"370_CR6","doi-asserted-by":"publisher","first-page":"944","DOI":"10.1109\/TSE.2015.2428709","volume":"41","author":"C Arora","year":"2015","unstructured":"Arora C, Sabetzadeh M, Briand L, Zimmer F (2015) Automated checking of conformance to requirements templates using natural language processing. IEEE Trans Softw Eng 41(10):944\u2013968","journal-title":"IEEE Trans Softw Eng"},{"issue":"10","key":"370_CR7","doi-asserted-by":"publisher","first-page":"918","DOI":"10.1109\/TSE.2016.2635134","volume":"43","author":"C Arora","year":"2016","unstructured":"Arora C, Sabetzadeh M, Briand L, Zimmer F (2016) Automated extraction and clustering of requirements glossary terms. 
Trans Softw Eng 43(10):918\u2013945","journal-title":"Trans Softw Eng"},{"key":"370_CR8","doi-asserted-by":"publisher","unstructured":"Arora C, Sabetzadeh M, Briand L, Zimmer F (2016) Extracting domain models from natural-language requirements: Approach and industrial evaluation. In: Proceedings of the ACM\/IEEE 19th International Conference on Model Driven Engineering Languages and Systems, p. 250\u2013260. ACM, New York, NY, USA. https:\/\/doi.org\/10.1145\/2976767.2976769","DOI":"10.1145\/2976767.2976769"},{"key":"370_CR9","doi-asserted-by":"crossref","unstructured":"Arora C, Sabetzadeh M, Goknil A, Briand LC, Zimmer F (2015) Change impact analysis for natural language requirements: An nlp approach. In: International Requirements Engineering Conference (RE), pp. 6\u201315. IEEE","DOI":"10.1109\/RE.2015.7320403"},{"key":"370_CR10","doi-asserted-by":"crossref","unstructured":"Aung TWW, Huo H, Sui Y (2020) A literature review of automatic traceability links recovery for software change impact analysis. In: Proceedings of the 28th International Conference on Program Comprehension, pp. 14\u201324","DOI":"10.1145\/3387904.3389251"},{"key":"370_CR11","doi-asserted-by":"publisher","first-page":"132","DOI":"10.1016\/j.jss.2015.05.006","volume":"106","author":"NH Bakar","year":"2015","unstructured":"Bakar NH, Kasirun ZM, Salleh N (2015) Feature extraction approaches from natural language requirements for reuse in software product lines: A systematic literature review. J Syst Softw 106:132\u2013149","journal-title":"J Syst Softw"},{"key":"370_CR12","first-page":"1137","volume":"3","author":"Y Bengio","year":"2003","unstructured":"Bengio Y, Ducharme R, Vincent P, Janvin C (2003) A neural probabilistic language model. J Mach Learn Res 3:1137\u20131155","journal-title":"J Mach Learn Res"},{"key":"370_CR13","doi-asserted-by":"publisher","unstructured":"Bohner: Impact analysis in the software change process: a year 2000 perspective. 
In: 1996 Proceedings of International Conference on Software Maintenance, pp. 42\u201351 (1996). https:\/\/doi.org\/10.1109\/ICSM.1996.564987","DOI":"10.1109\/ICSM.1996.564987"},{"key":"370_CR14","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1162\/tacl_a_00051","volume":"5","author":"P Bojanowski","year":"2017","unstructured":"Bojanowski P, Grave E, Joulin A, Mikolov T (2017) Enriching word vectors with subword information. Trans Assoc Comput Linguist 5:135\u2013146","journal-title":"Trans Assoc Comput Linguist"},{"issue":"6","key":"370_CR15","doi-asserted-by":"publisher","first-page":"1565","DOI":"10.1007\/s10664-013-9255-y","volume":"19","author":"M Borg","year":"2014","unstructured":"Borg M, Runeson P, Ard\u00f6 A (2014) Recovering from a decade: a systematic mapping of information retrieval approaches to software traceability. Emp Softw Eng 19(6):1565\u20131616. https:\/\/doi.org\/10.1007\/s10664-013-9255-y","journal-title":"Emp Softw Eng"},{"issue":"7","key":"370_CR16","doi-asserted-by":"publisher","first-page":"675","DOI":"10.1109\/TSE.2016.2620458","volume":"43","author":"M Borg","year":"2016","unstructured":"Borg M, Wnuk K, Regnell B, Runeson P (2016) Supporting change impact analysis using a recommendation system: an industrial case study in a safety-critical context. IEEE Trans Softw Eng 43(7):675\u2013700","journal-title":"IEEE Trans Softw Eng"},{"issue":"2","key":"370_CR17","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1191\/1478088706qp063oa","volume":"3","author":"V Braun","year":"2006","unstructured":"Braun V, Clarke V (2006) Using thematic analysis in psychology. Qualit Res Psychol 3(2):77\u2013101. https:\/\/doi.org\/10.1191\/1478088706qp063oa","journal-title":"Qualit Res Psychol"},{"issue":"3","key":"370_CR18","doi-asserted-by":"publisher","first-page":"463","DOI":"10.1080\/03098260600927575","volume":"30","author":"RL Breen","year":"2006","unstructured":"Breen RL (2006) A practical guide to focus-group research. 
J Geogr Higher Edu 30(3):463\u2013475","journal-title":"J Geogr Higher Edu"},{"key":"370_CR19","doi-asserted-by":"crossref","unstructured":"Castro-Herrera C, Cleland-Huang J, Mobasher B (2009) Enhancing stakeholder profiles to improve recommendations in online requirements elicitation. In: International Requirements Engineering Conference, pp. 37\u201346. IEEE","DOI":"10.1109\/RE.2009.20"},{"key":"370_CR20","doi-asserted-by":"crossref","unstructured":"Cer D, Yang Y, Kong Sy, Hua N, Limtiaco N, John RS, Constant N, Guajardo-C \u0301espedes M, Yuan S, Tar C, et al (2018) Universal sentence encoder. Preprint. https:\/\/arxiv.org\/abs\/1803.11175","DOI":"10.18653\/v1\/D18-2029"},{"key":"370_CR21","unstructured":"Chen L, Ali Babar M, Ali N (2009) Variability management in software product lines: A systematic review. In: Proceedings of the 13th International Software Product Line Conference, SPLC \u201909, Carnegie Mellon University, USA, pp. 81\u201390"},{"key":"370_CR22","doi-asserted-by":"crossref","unstructured":"Cleland-Huang J, Gotel OC, Huffman\u00a0Hayes J, M\u00e4der P, Zisman A (2014) Software traceability: trends and future directions. In: Future of Software Engineering Proceedings, pp. 55\u201369","DOI":"10.1145\/2593882.2593891"},{"issue":"1","key":"370_CR23","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1109\/MS.2005.1","volume":"22","author":"J Natt\u00a0och Dag","year":"2005","unstructured":"Natt\u00a0och Dag J, Regnell B, Gervasi V, Brinkkemper S (2005) A linguistic-engineering approach to large-scale requirements management. IEEE Softw 22(1):32\u201339","journal-title":"IEEE Softw"},{"key":"370_CR24","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/j.infsof.2018.12.007","volume":"110","author":"F Dalpiaz","year":"2019","unstructured":"Dalpiaz F, Van Der Schalk I, Brinkkemper S, Aydemir FB, Lucassen G (2019) Detecting terminological ambiguity in user stories: tool and experimentation. 
Inform Softw Technol 110:3\u201316","journal-title":"Inform Softw Technol"},{"key":"370_CR25","doi-asserted-by":"crossref","unstructured":"Davey J, Burd E (2000) Evaluating the suitability of data clustering for software remodularisation. In: Proceedings Seventh Working Conference on Reverse Engineering pp. 268\u2013276","DOI":"10.1109\/WCRE.2000.891478"},{"issue":"6","key":"370_CR26","doi-asserted-by":"publisher","first-page":"391","DOI":"10.1002\/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9","volume":"41","author":"S Deerwester","year":"1990","unstructured":"Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing by latent semantic analysis. J Am Soc Inf Sci 41(6):391\u2013407","journal-title":"J Am Soc Inf Sci"},{"key":"370_CR27","doi-asserted-by":"crossref","unstructured":"Deshpande G, Arora C, Ruhe G (2019) Data-driven elicitation and optimization of dependencies between requirements. In: 2019 IEEE 27th International Requirements Engineering Conference (RE), pp. 416\u2013421. IEEE","DOI":"10.1109\/RE.2019.00055"},{"key":"370_CR28","doi-asserted-by":"crossref","unstructured":"Deshpande G, Motger Q, Palomares C, Kamra I, Biesialska K, Franch X, Ruhe G, Ho J (2020) Requirements dependency extraction by integrating active learning with ontology-based retrieval. In: 2020 IEEE 28th International Requirements Engineering Conference (RE), pp. 78\u201389. IEEE","DOI":"10.1109\/RE48521.2020.00020"},{"key":"370_CR29","doi-asserted-by":"publisher","unstructured":"Devine P, Koh YS, Blincoe K (2021) Evaluating unsupervised text embeddings on software user feedback. In: 2021 IEEE 29th International Requirements Engineering Conference Workshops (REW), pp. 87\u201395. https:\/\/doi.org\/10.1109\/REW53955.2021.00020","DOI":"10.1109\/REW53955.2021.00020"},{"key":"370_CR30","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. 
arXiv preprint https:\/\/arxiv.org\/abs\/1810.04805"},{"key":"370_CR31","doi-asserted-by":"crossref","unstructured":"Dumitru H, Gibiec M, Hariri N, Cleland-Huang J, Mobasher B, Castro-Herrera C, Mirakhorli M (2011) On-demand feature recommendations derived from mining public product descriptions. In: International Conference on Software Engineering, pp. 181\u2013190","DOI":"10.1145\/1985793.1985819"},{"key":"370_CR32","doi-asserted-by":"crossref","unstructured":"Eyal-Salman H, Seriai AD, Dony C (2013) Feature-to-code traceability in a collection of software variants: Combining formal concept analysis and information retrieval. In: 2013 IEEE 14th International Conference on Information Reuse & Integration (IRI), pp. 209\u2013216","DOI":"10.1109\/IRI.2013.6642474"},{"issue":"1","key":"370_CR33","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1109\/TSE.2011.122","volume":"39","author":"D Falessi","year":"2011","unstructured":"Falessi D, Cantone G, Canfora G (2011) Empirical principles and an industrial case study in retrieving equivalent requirements via natural language processing techniques. Trans Softw Eng 39(1):18\u201344","journal-title":"Trans Softw Eng"},{"key":"370_CR34","unstructured":"Felfernig A, Falkner A, Atas M, Franch X, Palomares C (2017) OpenReq: Recommender systems in requirements engineering. In: RS-BDA, pp. 1\u20134"},{"key":"370_CR35","doi-asserted-by":"publisher","unstructured":"Feng Z, Guo D, Tang D, Duan N, Feng X, Gong M, Shou L, Qin B, Liu T, Jiang D, Zhou M (2020) CodeBERT: A pre-trained model for programming and natural languages. In: Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 1536\u20131547. Association for Computational Linguistics, Online. https:\/\/doi.org\/10.18653\/v1\/2020.findings-emnlp.139. 
https:\/\/aclanthology.org\/2020.findings-emnlp.139","DOI":"10.18653\/v1\/2020.findings-emnlp.139"},{"issue":"5","key":"370_CR36","doi-asserted-by":"publisher","first-page":"2298","DOI":"10.1007\/s10664-016-9451-7","volume":"22","author":"DM Fern\u00e1ndez","year":"2017","unstructured":"Fern\u00e1ndez DM, Wagner S, Kalinowski M, Felderer M, Mafra P, Vetr\u00f2 A, Conte T, Christiansson MT, Greer D, Lassenius C et al (2017) Naming the pain in requirements engineering. Empir Softw Eng 22(5):2298\u20132338","journal-title":"Empir Softw Eng"},{"issue":"06","key":"370_CR37","first-page":"28","volume":"34","author":"A Ferrari","year":"2017","unstructured":"Ferrari A, Dell\u2019Orletta F, Esuli A, Gervasi V, Gnesi S (2017) Natural language requirements processing: a 4d vision. IEEE Ann History Comput 34(06):28\u201335","journal-title":"IEEE Ann History Comput"},{"issue":"3","key":"370_CR38","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1109\/MS.2013.44","volume":"30","author":"A Ferrari","year":"2013","unstructured":"Ferrari A, Fantechi A, Gnesi S, Magnani G (2013) Model-based development and formal methods in the railway industry. IEEE Softw 30(3):28\u201334","journal-title":"IEEE Softw"},{"issue":"7","key":"370_CR39","doi-asserted-by":"publisher","first-page":"828","DOI":"10.1016\/j.scico.2012.04.003","volume":"78","author":"A Ferrari","year":"2013","unstructured":"Ferrari A, Fantechi A, Magnani G, Grasso D, Tempestini M (2013) The metr\u00f4 rio case study. Sci Comp Program 78(7):828\u2013842","journal-title":"Sci Comp Program"},{"key":"370_CR40","doi-asserted-by":"crossref","unstructured":"Ferrari A, Spagnolo GO, Dell\u2019Orletta F (2013) Mining commonalities and variabilities from natural language documents. In: Proceedings of the 17th International Software Product Line Conference, pp. 
116\u2013120","DOI":"10.1145\/2491627.2491634"},{"key":"370_CR41","doi-asserted-by":"publisher","unstructured":"Ferrari A, Spagnolo GO, Gnesi S (2017) Pure: A dataset of public requirements documents. In: 2017 IEEE 25th International Requirements Engineering Conference (RE), pp. 502\u2013505. https:\/\/doi.org\/10.1109\/RE.2017.29","DOI":"10.1109\/RE.2017.29"},{"key":"370_CR42","doi-asserted-by":"crossref","unstructured":"Gervasi V, Zowghi D (2014) Supporting traceability through affinity mining. In: International Requirements Engineering Conference (RE), pp. 143\u2013152. IEEE","DOI":"10.1109\/RE.2014.6912256"},{"key":"370_CR43","doi-asserted-by":"crossref","unstructured":"Gethers M, Dit B, Kagdi H, Poshyvanyk D (2012) Integrated impact analysis for managing software changes. In: 2012 34th International Conference on Software Engineering (ICSE), pp. 430\u2013440. IEEE","DOI":"10.1109\/ICSE.2012.6227172"},{"key":"370_CR44","doi-asserted-by":"crossref","unstructured":"Guo H, Singh MP (2020) Caspar: Extracting and synthesizing user stories of problems from app reviews. In: 2020 IEEE\/ACM 42nd International Conference on Software Engineering (ICSE), pp. 628\u2013640","DOI":"10.1145\/3377811.3380924"},{"key":"370_CR45","doi-asserted-by":"crossref","unstructured":"Guo J, Cheng J, Cleland-Huang J (2017) Semantically enhanced software traceability using deep learning techniques. In: International Conference on Software Engineering (ICSE), pp. 3\u201314. IEEE","DOI":"10.1109\/ICSE.2017.9"},{"key":"370_CR46","doi-asserted-by":"crossref","unstructured":"Hariri N, Castro-Herrera C, Cleland-Huang J, Mobasher B (2014) Recommendation systems in requirements discovery. In: Recommendation Systems in Software Engineering, pp. 455\u2013476. Springer","DOI":"10.1007\/978-3-642-45135-5_17"},{"key":"370_CR47","volume-title":"Lucene in action","author":"E Hatcher","year":"2005","unstructured":"Hatcher E, Gospodneti\u0107 O, McCandless M (2005) Lucene in action. 
Manning Greenwich, NY"},{"key":"370_CR48","doi-asserted-by":"crossref","unstructured":"Hey T, Keim J, Koziolek A, Tichy WF (2020) Norbert: Transfer learning for requirements classification. In: 2020 IEEE 28th International Requirements Engineering Conference (RE), pp. 169\u2013179. IEEE","DOI":"10.1109\/RE48521.2020.00028"},{"key":"370_CR49","volume-title":"Applied statistics for the behavioral sciences","author":"DE Hinkle","year":"2003","unstructured":"Hinkle DE, Wiersma W, Jurs SG (2003) Applied statistics for the behavioral sciences, vol 663. Houghton Mifflin College Division, NY"},{"key":"370_CR50","first-page":"223","volume":"93","author":"M Irshad","year":"2018","unstructured":"Irshad M, Petersen K, Poulding S (2018) A systematic literature review of software requirements reuse approaches. IST J 93:223\u2013245","journal-title":"IST J"},{"key":"370_CR51","doi-asserted-by":"crossref","unstructured":"Ito K, Ishio T, Inoue K (2017) Web-service for finding cloned files using b-bit minwise hashing. In: 2017 IEEE 11th International Workshop on Software Clones (IWSC), pp. 1\u20132. IEEE","DOI":"10.1109\/IWSC.2017.7880504"},{"issue":"11","key":"370_CR52","doi-asserted-by":"publisher","first-page":"15169","DOI":"10.1007\/s11042-018-6894-4","volume":"78","author":"H Jelodar","year":"2019","unstructured":"Jelodar H, Wang Y, Yuan C, Feng X, Jiang X, Li Y, Zhao L (2019) Latent dirichlet allocation (lda) and topic modeling: models, applications, a survey. Multimedia Tools Appl 78(11):15169\u201315211","journal-title":"Multimedia Tools Appl"},{"issue":"2065","key":"370_CR53","doi-asserted-by":"publisher","first-page":"20150202","DOI":"10.1098\/rsta.2015.0202","volume":"374","author":"IT Jolliffe","year":"2016","unstructured":"Jolliffe IT, Cadima J (2016) Principal component analysis: a review and recent developments. 
Philos Trans R Soc A Math Phys Eng Sci 374(2065):20150202","journal-title":"Philos Trans R Soc A Math Phys Eng Sci"},{"issue":"4","key":"370_CR54","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1007\/s11334-014-0232-4","volume":"10","author":"M Kassab","year":"2014","unstructured":"Kassab M, Neill C, Laplante P (2014) State of practice in requirements engineering: contemporary data. Innov Syst Softw Eng 10(4):235\u2013241","journal-title":"Innov Syst Softw Eng"},{"key":"370_CR55","doi-asserted-by":"crossref","unstructured":"Kontio J, Lehtola L, Bragge J (2004) Using the focus group method in software engineering: obtaining practitioner and user experiences. In: Proceedings. 2004 International Symposium on Empirical Software Engineering, 2004. ISESE\u201904., pp. 271\u2013280. IEEE","DOI":"10.1109\/ISESE.2004.1334914"},{"key":"370_CR56","doi-asserted-by":"crossref","unstructured":"Krueger C (2001) Easing the transition to software mass customization. In: International Workshop on Software Product-Family Engineering, pp. 282\u2013293. Springer","DOI":"10.1007\/3-540-47833-7_25"},{"key":"370_CR57","doi-asserted-by":"crossref","unstructured":"Kurtanovi\u0107 Z, Maalej W (2017) Automatically classifying functional and non-functional requirements using supervised machine learning. In: 2017 IEEE 25th International Requirements Engineering Conference (RE), pp. 490\u2013495. Ieee","DOI":"10.1109\/RE.2017.82"},{"key":"370_CR58","unstructured":"Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: International conference on machine learning, pp. 1188\u20131196"},{"key":"370_CR59","doi-asserted-by":"crossref","unstructured":"Lin J, Liu Y, Zeng Q, Jiang M, Cleland-Huang J (2021) Traceability transformed: Generating more accurate links with pre-trained bert models. In: ICSE 2021, to appear. 
arXiv:2102.04411v2","DOI":"10.1109\/ICSE43902.2021.00040"},{"key":"370_CR60","doi-asserted-by":"crossref","unstructured":"Lops P, De\u00a0Gemmis M, Semeraro G (2011) Content-based recommender systems: state of the art and trends. In: Recommender systems handbook, pp. 73\u2013105. Springer","DOI":"10.1007\/978-0-387-85820-3_3"},{"key":"370_CR61","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511809071","volume-title":"Introduction to information retrieval","author":"CD Manning","year":"2008","unstructured":"Manning CD, Sch\u00fctze H, Raghavan P (2008) Introduction to information retrieval. Cambridge University Press, Cambrigde"},{"key":"370_CR62","doi-asserted-by":"crossref","unstructured":"Matinnejad R, Nejati S, Briand LC, Bruckmann T (2016) Automated test suite generation for time-continuous simulink models. In: proceedings of the 38th International Conference on Software Engineering, pp. 595\u2013606","DOI":"10.1145\/2884781.2884797"},{"key":"370_CR63","doi-asserted-by":"crossref","unstructured":"Mavin A, Wilkinson P, Harwood A, Novak M (2009) Easy approach to requirements syntax (ears). In: 2009 17th IEEE International Requirements Engineering Conference, pp. 317\u2013322. IEEE","DOI":"10.1109\/RE.2009.9"},{"key":"370_CR64","unstructured":"Mikolov T, Chen K, Corrado G, Dean J (2013)\u00a0Efficient estimation of word representations in vector space. Preprint. https:\/\/arxiv.org\/abs\/1301.3781"},{"issue":"3","key":"370_CR65","doi-asserted-by":"publisher","first-page":"627","DOI":"10.1007\/s10515-018-0238-5","volume":"25","author":"K Narasimhan","year":"2018","unstructured":"Narasimhan K, Reichenbach C, Lawall J (2018) Cleaning up copy-paste clones with interactive merging. Autom Softw Eng 25(3):627\u2013673","journal-title":"Autom Softw Eng"},{"key":"370_CR66","doi-asserted-by":"crossref","unstructured":"Ninaus G, Reinfrank F, Stettinger M, Felfernig A (2014) Content-based recommendation techniques for requirements engineering. 
In: 2014 IEEE 1st International Workshop on Artificial Intelligence for Requirements Engineering (AIRE), pp. 27\u201334. IEEE","DOI":"10.1109\/AIRE.2014.6894853"},{"key":"370_CR67","doi-asserted-by":"crossref","unstructured":"Nyamawe AS, Liu H, Niu N, Umer Q, Niu Z (2019) Automated recommendation of software refactorings based on feature requests. In: International Requirements Engineering Conference (RE), pp. 187\u2013198. IEEE","DOI":"10.1109\/RE.2019.00029"},{"issue":"5","key":"370_CR68","doi-asserted-by":"publisher","first-page":"4315","DOI":"10.1007\/s10664-020-09871-2","volume":"25","author":"AS Nyamawe","year":"2020","unstructured":"Nyamawe AS, Liu H, Niu N, Umer Q, Niu Z (2020) Feature requests-based recommendation of software refactorings. Emp Softw Eng 25(5):4315\u20134347","journal-title":"Emp Softw Eng"},{"key":"370_CR69","doi-asserted-by":"crossref","unstructured":"Palomares C, Franch X, Fucci D (2018) Personal recommendations in requirements engineering: the openreq approach. In: International working conference on requirements engineering: foundation for software quality, pp. 297\u2013304. Springer","DOI":"10.1007\/978-3-319-77243-1_19"},{"issue":"10","key":"370_CR70","doi-asserted-by":"publisher","first-page":"1345","DOI":"10.1109\/TKDE.2009.191","volume":"22","author":"SJ Pan","year":"2009","unstructured":"Pan SJ, Yang Q (2009) A survey on transfer learning. IEEE Trans knowl data Eng 22(10):1345\u20131359","journal-title":"IEEE Trans knowl data Eng"},{"key":"370_CR71","doi-asserted-by":"crossref","unstructured":"Panichella A, Dit B, Oliveto R, Di\u00a0Penta M, Poshynanyk D, De\u00a0Lucia A (2013) How to effectively use topic models for software engineering tasks? an approach based on genetic algorithms. In: 2013 35th International Conference on Software Engineering (ICSE), pp. 522\u2013531. 
IEEE","DOI":"10.1109\/ICSE.2013.6606598"},{"key":"370_CR72","doi-asserted-by":"crossref","unstructured":"Pawlik M, Augsten N (2011) Rted: A robust algorithm for the tree edit distance. In: Proceedings of the VLDB Endowment 5(4)","DOI":"10.14778\/2095686.2095692"},{"key":"370_CR73","doi-asserted-by":"crossref","unstructured":"Ponte JM, Croft WB (1998) A language modeling approach to information retrieval. In: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pp. 275\u2013281","DOI":"10.1145\/290941.291008"},{"key":"370_CR74","unstructured":"Post A, Fuhr T (2021) Case study: How well can ibm\u2019s\" requirements quality assistant\" review automotive requirements? In: REFSQ Workshops"},{"issue":"11","key":"370_CR75","first-page":"1016","volume":"8","author":"L Prechelt","year":"2002","unstructured":"Prechelt L, Malpohl G, Philippsen M et al (2002) Finding plagiarisms among a set of programs with jplag. J. UCS 8(11):1016","journal-title":"J. UCS"},{"issue":"4","key":"370_CR76","doi-asserted-by":"publisher","first-page":"2464","DOI":"10.1007\/s10664-017-9564-7","volume":"23","author":"C Ragkhitwetsagul","year":"2018","unstructured":"Ragkhitwetsagul C, Krinke J, Clark D (2018) A comparison of code similarity analysers. Emp Softw Eng 23(4):2464\u20132519","journal-title":"Emp Softw Eng"},{"key":"370_CR77","unstructured":"\u0158eh\u016f\u0159ek R, Sojka P (2010) Software Framework for Topic Modelling with Large Corpora. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pp. 45\u201350. ELRA"},{"volume-title":"Recommend Syst Softw Eng","year":"2014","key":"370_CR78","unstructured":"Robillard MP, Maalej W, Walker RJ, Zimmermann T (eds) (2014) Recommend Syst Softw Eng. 
Springer, NY"},{"issue":"2","key":"370_CR79","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1007\/s10664-008-9102-8","volume":"14","author":"P Runeson","year":"2009","unstructured":"Runeson P, H\u00f6st M (2009) Guidelines for conducting and reporting case study research in software engineering. Emp Softw Eng 14(2):131\u2013164","journal-title":"Emp Softw Eng"},{"key":"370_CR80","unstructured":"Rus V, Lintean M, Banjade R, Niraula NB, Stefanescu D (2013) Semilar: The semantic similarity toolkit. In: Proceedings of the 51st annual meeting of the association for computational linguistics: system demonstrations, pp. 163\u2013168"},{"key":"370_CR81","doi-asserted-by":"crossref","unstructured":"Samer R, Stettinger M, Atas M, Felfernig A, Ruhe G, Deshpande G (2019) New approaches to the identification of dependencies between requirements. In: 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), pp. 1265\u20131270. IEEE","DOI":"10.1109\/ICTAI.2019.00-91"},{"key":"370_CR82","first-page":"110748","volume":"170","author":"A Shatnawi","year":"2020","unstructured":"Shatnawi A, Seriai A, Sahraoui H, Ziadi T, Seriai A (2020) Reside: Reusable service identification from software families. JSS 170:110748","journal-title":"JSS"},{"key":"370_CR83","doi-asserted-by":"publisher","first-page":"325","DOI":"10.1016\/j.jss.2016.07.039","volume":"131","author":"A Shatnawi","year":"2017","unstructured":"Shatnawi A, Seriai AD, Sahraoui H (2017) Recovering software product line architecture of a family of object-oriented product variants. J Syst Softw 131:325\u2013346","journal-title":"J Syst Softw"},{"key":"370_CR84","doi-asserted-by":"crossref","unstructured":"Shatnawi A, Ziadi T, Mohamadi MY (2019) Understanding source code variability in cloned android families: an empirical study on 75 families. In: 2019 26th Asia-Pacific Software Engineering Conference (APSEC), pp. 292\u2013299. 
IEEE","DOI":"10.1109\/APSEC48747.2019.00047"},{"issue":"4","key":"370_CR85","doi-asserted-by":"publisher","first-page":"341","DOI":"10.1016\/S1042-8143(89)80010-X","volume":"1","author":"ML Shaw","year":"1989","unstructured":"Shaw ML, Gaines BR (1989) Comparing conceptual structures: consensus, conflict, correspondence and contrast. Knowl Acquis 1(4):341\u2013363","journal-title":"Knowl Acquis"},{"key":"370_CR86","doi-asserted-by":"crossref","unstructured":"Tang W, Chen D, Luo P (2018) Bcfinder: A lightweight and platform-independent tool to find third-party components in binaries. In: 2018 25th Asia-Pacific Software Engineering Conference (APSEC), pp. 288\u2013297. IEEE","DOI":"10.1109\/APSEC.2018.00043"},{"key":"370_CR87","doi-asserted-by":"publisher","DOI":"10.1145\/3299771.3299774","volume-title":"An approach to identify use case scenarios from textual requirements specification. ISEC\u201919","author":"S Tiwari","year":"2019","unstructured":"Tiwari S, Ameta D, Banerjee A (2019) An approach to identify use case scenarios from textual requirements specification. ISEC\u201919. ACM, NY. https:\/\/doi.org\/10.1145\/3299771.3299774"},{"issue":"4","key":"370_CR88","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1145\/3381307.3381310","volume":"19","author":"A Walker","year":"2020","unstructured":"Walker A, Cerny T, Song E (2020) Open-source tools and benchmarks for code-clone detection: past, present, and future trends. ACM SIGAPP Appl Comput Rev 19(4):28\u201339","journal-title":"ACM SIGAPP Appl Comput Rev"},{"key":"370_CR89","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1016\/j.jss.2018.09.001","volume":"146","author":"B Wang","year":"2018","unstructured":"Wang B, Peng R, Li Y, Lai H, Wang Z (2018) Requirements traceability technologies and technology transfer decision support: a systematic review. 
J Syst Softw 146:59\u201379","journal-title":"J Syst Softw"},{"key":"370_CR90","doi-asserted-by":"crossref","unstructured":"Wang M, Wang P, Xu Y (2017) Ccsharp: An efficient three-phase code clone detector using modified pdgs. In: 2017 24th Asia-Pacific Software Engineering Conference (APSEC), pp. 100\u2013109. IEEE","DOI":"10.1109\/APSEC.2017.16"},{"key":"370_CR91","doi-asserted-by":"crossref","unstructured":"Wang W, Niu N, Liu H, Niu Z (2018) Enhancing automated requirements traceability by resolving polysemy. In: 2018 IEEE 26th International Requirements Engineering Conference (RE), pp. 40\u201351. IEEE","DOI":"10.1109\/RE.2018.00-53"},{"key":"370_CR92","doi-asserted-by":"crossref","unstructured":"White M, Tufano M, Vendome C, Poshyvanyk D (2016) Deep learning code fragments for code clone detection. In: International Conference on Automated Software Engineering (ASE), pp. 87\u201398. IEEE","DOI":"10.1145\/2970276.2970326"},{"key":"370_CR93","doi-asserted-by":"publisher","first-page":"136","DOI":"10.1016\/j.scico.2014.11.013","volume":"101","author":"R Wieringa","year":"2015","unstructured":"Wieringa R, Daneva M (2015) Six strategies for generalizing software engineering theories. Sci Comp Programm 101:136\u2013152","journal-title":"Sci Comp Programm"},{"key":"370_CR94","doi-asserted-by":"publisher","DOI":"10.1145\/3444689","author":"L Zhao","year":"2021","unstructured":"Zhao L, Alhoshan W, Ferrari A, Letsholo KJ, Ajagbe MA, Chioasca EV, Batista-Navarro RT (2021) Natural language processing for requirements engineering: a systematic mapping study. ACM Comput Surv. https:\/\/doi.org\/10.1145\/3444689","journal-title":"ACM Comput Surv"},{"key":"370_CR95","doi-asserted-by":"crossref","unstructured":"Ziadi T, Frias L, da\u00a0Silva MAA, Ziane M (2012) Feature identification from the source code of product variants. In: 2012 16th European Conference on Software Maintenance and Reengineering, pp. 417\u2013422. 
IEEE","DOI":"10.1109\/CSMR.2012.52"}],"updated-by":[{"updated":{"date-parts":[[2022,2,26]],"date-time":"2022-02-26T00:00:00Z","timestamp":1645833600000},"DOI":"10.1007\/s00766-022-00376-6","type":"correction","label":"Correction"}],"container-title":["Requirements Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00766-021-00370-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00766-021-00370-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00766-021-00370-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,3,5]],"date-time":"2022-03-05T14:03:10Z","timestamp":1646488990000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00766-021-00370-4"}},"subtitle":["A case study in the railway domain"],"short-title":[],"issued":{"date-parts":[[2022,1,18]]},"references-count":95,"alternative-id":["370"],"URL":"https:\/\/doi.org\/10.1007\/s00766-021-00370-4","relation":{},"ISSN":["0947-3602","1432-010X"],"issn-type":[{"type":"print","value":"0947-3602"},{"type":"electronic","value":"1432-010X"}],"subject":[],"published":{"date-parts":[[2022,1,18]]},"assertion":[{"value":"11 June 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 December 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 January 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 February 2022","order":4,"name":"change_date","label":"Change 
Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Correction","order":5,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"A Correction to this paper has been published:","order":6,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"https:\/\/doi.org\/10.1007\/s00766-022-00376-6","URL":"https:\/\/doi.org\/10.1007\/s00766-022-00376-6","order":7,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}}]}}