Abstract
In the railway safety-critical domain, requirements documents have to abide by strict quality criteria. Rule-based natural language processing (NLP) techniques have been developed to automatically identify quality defects in natural language requirements. However, the literature lacks empirical studies on the application of these techniques in industrial settings. Our goal is to investigate to what extent NLP can be practically applied to detect defects in the requirements documents of a railway signalling manufacturer. To address this goal, we first identified a set of typical defect classes, and, for each class, an engineer of the company implemented a set of defect-detection patterns by means of the GATE tool for text processing. After a preliminary analysis, we applied the patterns to a large set of 1866 requirements previously annotated for defects. The output of the patterns was further inspected by two domain experts to check the false positive cases. Additional discard-patterns were defined to automatically remove these cases. Finally, SREE, a tool that searches for typically ambiguous terms, was applied to the requirements. The experiments show that SREE and our patterns may play complementary roles in the detection of requirements defects. This is one of the first works in which defect-detection NLP techniques are applied to a very large set of industrial requirements annotated by domain experts. We contribute a comparison between the traditional manual techniques used in industry for requirements analysis and analysis performed with NLP. Our experience shows that several discrepancies can be observed between the two approaches. The analysis of the discrepancies offers hints to improve the capabilities of NLP techniques with company-specific solutions, and suggests that company practices also need to be modified to effectively exploit NLP tools.
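The interplay between defect-detection patterns and discard-patterns described above can be illustrated with a minimal sketch. It is not the GATE implementation used in the study: plain Python regular expressions stand in for the GATE patterns, and the defect classes, pattern contents, function names and sample requirements are all hypothetical, chosen only for illustration.

```python
import re
from typing import NamedTuple


class Finding(NamedTuple):
    req_id: str
    defect_class: str
    matched_text: str


# Hypothetical detection patterns, one per defect class (illustrative only).
DETECTION_PATTERNS = {
    "vague_term": re.compile(r"\b(?:adequate|appropriate|as far as possible)\b", re.I),
    "passive_voice": re.compile(r"\b(?:is|are|be|been)\s+\w+ed\b", re.I),
}

# Hypothetical discard-patterns: contexts in which a match is a known false positive.
DISCARD_PATTERNS = {
    "vague_term": [re.compile(r"\bappropriate\s+authority\b", re.I)],
}


def is_discarded(defect_class, text, start, end):
    """Return True if a discard-pattern match fully covers the span [start, end)."""
    for pattern in DISCARD_PATTERNS.get(defect_class, []):
        for m in pattern.finditer(text):
            if m.start() <= start and end <= m.end():
                return True
    return False


def detect_defects(requirements):
    """Apply every detection pattern, then drop matches covered by a discard-pattern."""
    findings = []
    for req_id, text in requirements.items():
        for defect_class, pattern in DETECTION_PATTERNS.items():
            for m in pattern.finditer(text):
                if not is_discarded(defect_class, text, m.start(), m.end()):
                    findings.append(Finding(req_id, defect_class, m.group(0)))
    return findings


# Made-up requirements, not taken from the study's dataset.
SAMPLE_REQUIREMENTS = {
    "R1": "The system shall respond in an adequate time.",
    "R2": "The driver shall notify the appropriate authority.",
    "R3": "The brake command is issued by the onboard unit.",
}

if __name__ == "__main__":
    for finding in detect_defects(SAMPLE_REQUIREMENTS):
        print(finding)
```

In this sketch, the word "appropriate" in R2 is matched by the detection pattern but removed because the discard-pattern covers it, which mirrors the role that discard-patterns play in reducing false positives.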
Notes
The standard has since been replaced by ISO/IEC/IEEE 29148:2011 (ISO 2011).
In this context, we also consider a dictionary from SREE-reduced, as defined in Section 3.4, to be a pattern i.
The dataset appears balanced since VE1 continued to randomly select new requirements from the original requirements considered, until a balanced number of accepted and rejected requirements was obtained.
According to Landis and Koch (1977), the following qualitative measures are associated with the different ranges of Cohen’s Kappa: k < 0, no agreement; 0 ≤ k ≤ 0.20, slight; 0.21 ≤ k ≤ 0.40, fair; 0.41 ≤ k ≤ 0.60, moderate; 0.61 ≤ k ≤ 0.80, substantial; and 0.81 ≤ k ≤ 1, almost perfect agreement.
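As a minimal sketch of how this scale is applied, the following code computes Cohen’s Kappa for two annotators and maps the value to the qualitative labels listed above. The function names and the sample annotations are hypothetical. Values that fall between two listed ranges (for example 0.205) are assigned to the lower band, which is an assumption of this sketch.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[lab] * count_b[lab] for lab in set(count_a) | set(count_b)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


def landis_koch_label(kappa):
    """Map a Kappa value to the qualitative scale of Landis and Koch (1977)."""
    if kappa < 0:
        return "no agreement"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Made-up annotations: A = accepted, R = rejected.
ANNOTATOR_1 = list("AAAAAARRRR")
ANNOTATOR_2 = list("AAAAARRRRA")

if __name__ == "__main__":
    k = cohen_kappa(ANNOTATOR_1, ANNOTATOR_2)
    print(round(k, 2), landis_koch_label(k))
```

For the sample annotations, the annotators agree on 8 of 10 items with a chance agreement of 0.52, so the Kappa is about 0.58, which falls in the moderate band.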
The results presented in Tables 8 and 9 differ from those presented in our original conference paper. When VE2 replicated the experiments performed by VE1, discrepancies in the results emerged. These were traced back to the usage of a support tool, developed by VE1 on top of GATE, to ease the analysis of the requirements. The tool introduced further manipulations, which led to incorrect numerical results. The results presented in this paper are based solely on the analysis of the output of GATE, and are, to the best of our knowledge, correct.
The requirement was not rejected since it was clarified by other subsequent requirements. This violates guideline (c), which requires requirements to be stand-alone, but the defect was not considered crucial.
The value of pR that considers the analysis of the false positive cases for the SREE dictionaries cannot be provided, since we analysed only a subset of the defects for the plurals class. However, the average value of pD gives a clear indication of the precision of SREE at the level of defects.
References
Alvarez SA (2002) An exact analytical relation among recall, precision, and classification accuracy in information retrieval. Tech. Rep BCCS-02-01. Computer Science Department, Boston College
Ambriola V, Gervasi V (2006) On the systematic analysis of natural language requirements with Circe. Autom Softw Eng 13(1):107–167
Anda B, Sjøberg DI (2002) Towards an inspection technique for use case models. In: Proceedings of the 14th international conference on software engineering and knowledge engineering (SEKE’02). ACM, pp 127–134
Arora C, Sabetzadeh M, Briand L, Zimmer F (2015) Automated checking of conformance to requirements templates using natural language processing. IEEE Trans Softw Eng 41(10):944–968
Aurum A, Petersson H, Wohlin C (2002) State-of-the-art: software inspections after 25 years. Softw Test Verif Reliab 12(3):133–154
Baskerville RL, Wood-Harper AT (1996) A critical perspective on action research as a method for information systems research. J Inf Technol 11(3):235–246
Berry DM, Kamsties E (2005) The syntactically dangerous all and plural in specifications. IEEE Softw 22(1):55–57
Berry DM, Kamsties E, Krieger MM (2003) From contract drafting to software specification: Linguistic sources of ambiguity. https://cs.uwaterloo.ca/~dberry/handbook/ambiguityHandbook.pdf
Berry D, Gacitua R, Sawyer P, Tjong SF (2012) The case for dumb requirements engineering tools. In: Proceedings of the 18th international working conference on requirements engineering: foundation for software quality (REFSQ’12), vol 7195. Springer, LNCS, pp 211–217
Berry DM, Cleland-Huang J, Ferrari A, Maalej W, Mylopoulos J, Zowghi D (2017) Panel: context-dependent evaluation of tools for NL RE tasks: recall vs. precision, and beyond. In: 2017 IEEE 25th International requirements engineering conference (RE), pp 570–573. https://doi.org/10.1109/RE.2017.64
Bonin F, Dell’Orletta F, Venturi G, Montemagni S (2010) A contrastive approach to multi-word term extraction from domain corpora. In: Proceedings of the 7th International conference on language resources and evaluation (LREC’10), pp 19–21
Casamayor A, Godoy D, Campo M (2012) Functional grouping of natural language requirements for assistance in architectural software design. KBS 30:78–86
CENELEC (2011) EN 50128:2011: railway applications - communication, signalling and processing systems - software for railway control and protection systems. Tech. rep.
Chantree F, Nuseibeh B, De Roeck A, Willis A (2006) Identifying nocuous ambiguities in natural language requirements. In: Proceedings of the 14th IEEE international requirements engineering conference (RE’06). IEEE, pp 56–65
Cleland-Huang J, Czauderna A, Gibiec M, Emenecker J (2010) A machine learning approach for tracing regulatory codes to product specific requirements. In: ICSE (1). ACM, pp 155–164
Collins-Thompson K (2014) Computational assessment of text readability: a survey of current and future research. ITL-Int J Appl Linguist 165(2):97–135
Cunningham H (2002) GATE, a general architecture for text engineering. Comput Human 36(2):223–254
Cutts M (1996) The plain English guide. Oxford University Press
Derczynski L, Maynard D, Rizzo G, van Erp M, Gorrell G, Troncy R, Petrak J, Bontcheva K (2015) Analysis of named entity recognition and linking for tweets. Inf Process Manag 51(2):32–49
Fabbrini F, Fusani M, Gnesi S, Lami G (2001) The linguistic approach to the natural language requirements quality: benefit of the use of an automatic tool. In: Proceedings of the 26th Annual NASA Goddard software engineering workshop. IEEE, pp 97–105
Fagan ME (1976) Design and code inspections to reduce errors in program development. IBM Syst J 15(3):182–211
Falessi D, Cantone G, Canfora G (2013) Empirical principles and an industrial case study in retrieving equivalent requirements via natural language processing techniques. IEEE Trans Softw Eng 39(1):18–44
Femmer H, Kučera J, Vetrò A (2014) On the impact of passive voice requirements on domain modelling. In: Proceedings of the 8th ACM / IEEE international symposium on empirical software engineering and measurement (ESEM’14), Art. 21. ACM
Femmer H, Fernández DM, Wagner S, Eder S (2017) Rapid quality assurance with requirements smells. J Syst Softw 123:190–213
Ferrari A, Gnesi S (2012) Using collective intelligence to detect pragmatic ambiguities. In: Proceedings of the 20th IEEE international requirements engineering conference (RE’12). IEEE, pp 191–200
Ferrari A, dell’Orletta F, Spagnolo GO, Gnesi S (2014) Measuring and improving the completeness of natural language requirements. In: Proceedings of the 20th international working conference on requirements engineering: foundation for software quality (REFSQ’14). Springer, pp 23–38
Ferrari A, Spoletini P, Gnesi S (2016) Ambiguity and tacit knowledge in requirements elicitation interviews. Requir Eng 21(3):333–355
Ferrari A, Dell’Orletta F, Esuli A, Gervasi V, Gnesi S (2017) Natural language requirements processing: a 4D vision. IEEE Software (to appear)
Gacitua R, Sawyer P, Gervasi V (2010) On the effectiveness of abstraction identification in requirements engineering. In: Proceedings of the 18th IEEE international requirements engineering conference (RE’10). IEEE, pp 5–14
Gervasi V, Zowghi D (2005) Reasoning about inconsistencies in natural language requirements. ACM Trans Softw Eng Methodol 14(3):277–330
Ghaisas S, Rose P, Daneva M, Sikkel K, Wieringa RJ (2013) Generalizing by similarity: Lessons learnt from industrial case studies. In: Proceedings of the 1st international workshop on conducting empirical studies in industry. IEEE Press, pp 37–42
Gleich B, Creighton O, Kof L (2010) Ambiguity detection: towards a tool explaining ambiguity sources. In: Proceedings of the 16th international working conference on requirements engineering: foundation for software quality (REFSQ’10), vol 6182. Springer, LNCS, pp 218–232
Gnesi S, Lami G, Trentanni G (2005) An automatic tool for the analysis of natural language requirements. Int J Comput Syst Sci Eng 20(1):53–62
Gorschek T, Garre P, Larsson S, Wohlin C (2006) A model for technology transfer in practice. IEEE Softw 23(6):88–95
Goth G (2016) Deep or shallow, NLP is breaking out. Commun ACM 59(3):13–16
IEEE (1998) IEEE guide for developing system requirements specifications. IEEE Std 1233, 1998 Edition, pp 1–36, https://doi.org/10.1109/IEEESTD.1998.88826
ISO, IEC, IEEE (2011) ISO/IEC/IEEE international standard - systems and software engineering - life cycle processes - requirements engineering. ISO/IEC/IEEE 29148:2011(E), pp 1–94, https://doi.org/10.1109/IEEESTD.2011.6146379
Kamsties E (2005) Understanding ambiguity in requirements engineering. In: Engineering and managing software requirements. Springer, Berlin, pp 245–266
Kamsties E, Berry DM, Paech B (2001) Detecting ambiguities in requirements documents using inspections. In: Proceedings of the 1st workshop on inspection in software engineering (WISE’01), pp 68–80
Kang N, van Mulligen EM, Kors JA (2011) Comparing and combining chunkers of biomedical text. J Biomed Inform 44(2):354–360
Kassab M, Neill C, Laplante P (2014) State of practice in requirements engineering: contemporary data. Innov Syst Softw Eng 10(4):235–241
Kiyavitskaya N, Zeni N, Mich L, Berry DM (2008) Requirements for tools for ambiguity identification and measurement in natural language requirements specifications. Requir Eng 13(3):207–239
Kof L (2008) From textual scenarios to message sequence charts: inclusion of condition generation and actor extraction. In: Proceedings of the 16th IEEE international requirements engineering conference, (RE’08). IEEE, pp 331–332
Kof L (2009) Translation of textual specifications to automata by means of discourse context modeling. In: Proceedings of the 15th international working conference on requirements engineering: foundation for software quality (REFSQ’09), vol 5512. Springer, LNCS, pp 197–211
Kof L (2010) From requirements documents to system models: a tool for interactive semi-automatic translation. In: Proceedings of the 18th IEEE international requirements engineering conference (RE’10). IEEE, pp 391–392
Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33(1):159–174
Lian X, Rahimi M, Cleland-Huang J, Zhang L, Ferrari R, Smith M (2016) Mining requirements knowledge from collections of domain documents. In: Proceedings of the 24th IEEE international requirements engineering conference (RE’16). IEEE, pp 156–165
Maalej W, Nabil H (2015) Bug report, feature request, or simply praise? On automatically classifying app reviews. In: Proceedings of the 23rd IEEE international requirements engineering conference, (RE’15). IEEE, pp 116–125
Manning CD (2011) Part-of-speech tagging from 97% to 100%: is it time for some linguistics? In: Proceedings of the 12th international conference on intelligent text processing and computational linguistics (CICLing’11), LNCS, vol 6608. Springer, pp 171–189
Mavin A, Wilkinson P, Harwood A, Novak M (2009) Easy approach to requirements syntax (EARS). In: Proceedings of the 17th IEEE international requirements engineering conference (RE’09). IEEE, pp 317–322
Mavin A, Wilkinson P, Gregory S, Uusitalo E (2016) Listens learned (8 lessons learned applying EARS). In: Proceedings of the 24th IEEE international requirements engineering conference (RE’16). IEEE, pp 276–282
Mich L (1996) NL-OOPS: from natural language to object oriented requirements using the natural language processing system LOLITA. NLE 2(2):161–187
Mich L, Franch M, Inverardi PN (2004) Market research for requirements analysis using linguistic tools. Requir Eng 9(1):40–56
Pohl K, Rupp C (2011) Requirements engineering fundamentals. Rocky Nook, Inc
Quirchmayr T, Paech B, Kohl R, Karey H (2017) Semi-automatic software feature-relevant information extraction from natural language user manuals. In: Proceedings of the 23rd international working conference on requirements engineering: foundation for software quality (REFSQ’17). Springer, pp 255–272
Robeer M, Lucassen G, van der Werf JME, Dalpiaz F, Brinkkemper S (2016) Automated extraction of conceptual models from user stories via NLP. In: Proceedings of the 24th IEEE international requirements engineering conference (RE’16). IEEE, pp 196–205
Rosadini B, Ferrari A, Gori G, Fantechi A, Gnesi S, Trotta I, Bacherini S (2017) Using NLP to detect requirements defects: an industrial experience in the railway domain. In: Proceedings of the 23rd international working conference on requirements engineering: foundation for software quality (REFSQ’17). LNCS, vol 10153, pp 344–360
Rosenberg LH, Hammer F, Huffman LL (1998) Requirements, testing and metrics. In: 15th Annual Pacific Northwest software quality conference
RTCA Inc, EUROCAE (2012) DO-178C: software considerations in airborne systems and equipment certification. Tech. rep.
Runeson P, Höst M, Rainer A, Regnell B (2012) Case study research in software engineering: guidelines and examples. Wiley
Shull F, Rus I, Basili V (2000) How perspective-based reading can improve requirements inspections. IEEE Comput 33(7):73–79
Sultanov H, Hayes JH (2013) Application of reinforcement learning to requirements engineering: requirements tracing. In: Proceedings of the 21st IEEE international requirements engineering conference (RE’13). IEEE, pp 52–61
Terzakis J, Gregory S (2016) Ramp: requirements authors mentoring program. In: Proceedings of the 24th IEEE international requirements engineering conference (RE’16). IEEE, pp 323–328
Tjong SF, Berry DM (2013) The design of SREE: a prototype potential ambiguity finder for requirements specifications and lessons learned. In: Proceedings of the 19th international working conference on requirements engineering: foundation for software quality (REFSQ’13), vol 7830. Springer, LNCS, pp 80–95
Wieringa R, Daneva M (2015) Six strategies for generalizing software engineering theories. Sci Comput Program 101:136–152
Wilmink M, Bockisch C (2017) On the ability of lightweight checks to detect ambiguity in requirements documentation. In: Proceedings of the 23rd international working conference on requirements engineering: foundation for software quality (REFSQ’17), vol 10153. Springer International Publishing, LNCS, pp 327–343
Wilson WM, Rosenberg LH, Hyatt LE (1997) Automated analysis of requirement specifications. In: Proceedings of the 19th international conference on software engineering. ACM, pp 161–171
Yang H, De Roeck A, Gervasi V, Willis A, Nuseibeh B (2011) Analysing anaphoric ambiguity in natural language requirements. Requir Eng 16(3):163–189
Yin RK (2013) Case study research: design and methods. Sage Publications
Yue T, Briand LC, Labiche Y (2015) aToucan: an automated framework to derive UML analysis models from use case models. ACM Trans Softw Eng Methodol (TOSEM) 24(3):13
Zhang H, Yue T, Ali S, Liu C (2016) Towards mutation analysis for use cases. In: Proceedings of the ACM/IEEE 19th international conference on model driven engineering languages and systems. ACM, pp 363–373
Zowghi D, Gervasi V, McRae A (2001) Using default reasoning to discover inconsistencies in natural language requirements. In: Proceedings of the 8th Asia-Pacific software engineering conference (APSEC’01), pp 133–140
Acknowledgments
The authors would like to thank the anonymous reviewers for their valuable recommendations, which helped make the paper clearer and more complete. We are also extremely grateful to Daniel M. Berry for providing the dictionaries of SREE, and to Daniel Méndez Fernández for his valuable suggestions on reporting case studies in software engineering. This work has been partially funded by the ASTRail project. This project received funding from the Shift2Rail Joint Undertaking under the European Union’s Horizon 2020 research and innovation programme under grant agreement No 777561. The content of this paper reflects only the authors’ view, and the Shift2Rail Joint Undertaking is not responsible for any use that may be made of the included information.
Additional information
Communicated by: Anna Perini and Paul Grünbacher
Cite this article
Ferrari, A., Gori, G., Rosadini, B. et al. Detecting requirements defects with NLP patterns: an industrial experience in the railway domain. Empir Software Eng 23, 3684–3733 (2018). https://doi.org/10.1007/s10664-018-9596-7
DOI: https://doi.org/10.1007/s10664-018-9596-7