Abstract
[Context and motivation] System requirements are normally provided as natural language documents. Such documents need to be properly structured so that readers can take in the requirements easily. A structure that supports a proper understanding of a requirements document shall satisfy two main quality attributes: (i) requirements relatedness: each requirement is conceptually connected with the other requirements in the same section; (ii) sections independence: each section is conceptually separated from the others. [Question/Problem] Automatically identifying the parts of the document that lack requirements relatedness and sections independence may help improve the document structure. [Principal idea/results] To this end, we define a novel clustering algorithm named Sliding Head-Tail Component (S-HTC). The algorithm groups together similar requirements that are contiguous in the requirements document. We claim that such an algorithm allows discovering the structure of the document as it is perceived by the reader. If the structure originally provided by the document does not match the structure discovered by the algorithm, hints are given to identify the parts of the document that lack requirements relatedness and sections independence. [Contribution] We evaluate the effectiveness of the algorithm with a pilot test on a requirements standard of the railway domain (583 requirements).
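The abstract does not spell out how S-HTC works, so the following is only a minimal sketch of the general idea it describes: group contiguous requirements that are similar, then compare the discovered group boundaries with the section boundaries of the original document. It is not the S-HTC algorithm. The Jaccard word-overlap similarity, the 0.3 threshold, the sample requirements, and all function names are assumptions made for illustration.

```python
def tokens(text):
    """Lowercased set of whitespace-separated words (a deliberately naive tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word-overlap similarity in [0, 1]; an assumed stand-in for a real measure."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def contiguous_clusters(requirements, threshold=0.3):
    """Greedily merge each requirement into the previous group if it is
    similar enough to that group's last requirement; otherwise open a new group."""
    clusters = []
    for req in requirements:
        if clusters and jaccard(clusters[-1][-1], req) >= threshold:
            clusters[-1].append(req)
        else:
            clusters.append([req])
    return clusters


def cut_points(groups):
    """Positions (in the flattened document) at which a new group starts."""
    cuts, pos = set(), 0
    for group in groups[:-1]:
        pos += len(group)
        cuts.add(pos)
    return cuts


def structure_hints(sections, threshold=0.3):
    """Compare the document's own sections with the discovered clusters.

    Returns two sets of positions:
    - section cuts with no matching cluster cut: adjacent sections look related
      (a possible lack of sections independence);
    - cluster cuts with no matching section cut: a section mixes unrelated
      requirements (a possible lack of requirements relatedness).
    """
    flat = [req for section in sections for req in section]
    clusters = contiguous_clusters(flat, threshold)
    section_cuts, cluster_cuts = cut_points(sections), cut_points(clusters)
    return section_cuts - cluster_cuts, cluster_cuts - section_cuts


# Hypothetical three-section document used only to exercise the sketch.
demo_sections = [
    ["The radio shall support voice calls"],
    ["The radio shall support group voice calls",
     "The driver display shall show the train number"],
    ["The display shall show the train speed"],
]
independence_hints, relatedness_hints = structure_hints(demo_sections)
print("Possibly dependent section boundaries at:", sorted(independence_hints))
print("Possibly unrelated requirements split at:", sorted(relatedness_hints))
```

Comparing only adjacent requirements keeps the sketch short; a window-based or head-tail comparison, as the algorithm's name suggests, would likely be more robust to a single atypical requirement inside an otherwise coherent section.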
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ferrari, A., Gnesi, S., Tolomei, G. (2013). Using Clustering to Improve the Structure of Natural Language Requirements Documents. In: Doerr, J., Opdahl, A.L. (eds) Requirements Engineering: Foundation for Software Quality. REFSQ 2013. Lecture Notes in Computer Science, vol 7830. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37422-7_3
DOI: https://doi.org/10.1007/978-3-642-37422-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37421-0
Online ISBN: 978-3-642-37422-7
eBook Packages: Computer Science, Computer Science (R0)