Detecting, classifying, and tracing non-functional software requirements

  • RE 2015
Requirements Engineering

Abstract

In this paper, we describe a novel unsupervised approach for detecting, classifying, and tracing non-functional software requirements (NFRs). The proposed approach exploits the textual semantics of software functional requirements (FRs) to infer potential quality constraints enforced in the system. In particular, we conduct a systematic analysis of a series of word similarity methods and clustering techniques to generate semantically cohesive clusters of FR words. These clusters are classified into various categories of NFRs based on their semantic similarity to basic NFR labels. Discovered NFRs are then traced to their implementation in the solution space based on their textual semantic similarity to source code artifacts. Three software systems are used to conduct the experimental analysis in this paper. The results show that methods that exploit massive sources of textual human knowledge are more accurate in capturing and modeling the notion of similarity between FR words in a software system. Results also show that hierarchical clustering algorithms are more capable of generating thematic word clusters than partitioning clustering techniques. In terms of performance, our analysis indicates that the proposed approach can discover, classify, and trace NFRs with accuracy levels that can be adequate for practical applications.
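The clustering and classification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the similarity scores in `TOY_SIM`, the merge threshold, the example words, and all function names are hypothetical, chosen only to show how average-linkage hierarchical clustering of FR words might be followed by assigning each cluster to the NFR label it is most similar to. A real run would replace `sim` with one of the word similarity methods the paper evaluates, and the tracing step is omitted here.

```python
from itertools import combinations

# Hypothetical pairwise similarity scores in [0, 1]. These are made-up
# illustration values, not results from the paper.
TOY_SIM = {
    frozenset(("password", "encrypt")): 0.8,
    frozenset(("password", "security")): 0.7,
    frozenset(("encrypt", "security")): 0.9,
    frozenset(("fast", "response")): 0.7,
    frozenset(("fast", "performance")): 0.8,
    frozenset(("response", "performance")): 0.6,
}


def sim(a, b):
    """Return the toy similarity of two words (0.1 for unlisted pairs)."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.1)


def avg_link(c1, c2):
    """Average pairwise similarity between two groups of words."""
    return sum(sim(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def cluster(words, threshold):
    """Agglomerative (average-linkage) clustering of words.

    Repeatedly merges the two most similar clusters and stops when the
    best available merge falls below `threshold` (an assumed parameter).
    """
    clusters = [[w] for w in words]
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg_link(clusters[p[0]], clusters[p[1]]),
        )
        if avg_link(clusters[i], clusters[j]) < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def classify(word_cluster, labels):
    """Pick the NFR label with the highest average similarity to the cluster."""
    return max(labels, key=lambda label: avg_link(word_cluster, [label]))


fr_words = ["password", "encrypt", "fast", "response"]
nfr_labels = ["security", "performance"]

word_clusters = cluster(fr_words, threshold=0.5)
assigned = {tuple(c): classify(c, nfr_labels) for c in word_clusters}
for members, label in assigned.items():
    print(members, "->", label)
```

A threshold-based stopping rule is used here only to keep the sketch short; how to cut the cluster hierarchy is a design choice that this example does not settle.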

Acknowledgments

The authors would like to thank the study participants, as well as the Institutional Review Board (IRB) at LSU for approving this research. This work was supported in part by the Louisiana Board of Regents Research Competitiveness Subprogram (LA BoR-RCS), contract number: LEQSF(2015-18)-RD-A-07.

Author information

Corresponding author

Correspondence to Anas Mahmoud.

About this article

Cite this article

Mahmoud, A., Williams, G. Detecting, classifying, and tracing non-functional software requirements. Requirements Eng 21, 357–381 (2016). https://doi.org/10.1007/s00766-016-0252-8

  • DOI: https://doi.org/10.1007/s00766-016-0252-8
