Using information retrieval based coupling measures for impact analysis

Empirical Software Engineering

Abstract

Coupling is an important property of software systems, which directly impacts program comprehension. In addition, the strength of coupling measured between modules in software is often used as a predictor of external software quality attributes such as changeability, ripple effects of changes and fault-proneness. This paper presents a new set of coupling measures for Object-Oriented (OO) software systems that measure the conceptual coupling of classes. Conceptual coupling is the degree to which the identifiers and comments from different classes relate to each other, and it is measured through the use of Information Retrieval (IR) techniques. The proposed measures differ from existing coupling measures and capture dimensions of coupling that the existing measures do not. The paper investigates the use of the conceptual coupling measures during change impact analysis. It reports the findings of a case study on the source code of the Mozilla web browser, where the conceptual coupling metrics were compared to nine existing structural coupling metrics and proved to be better predictors of classes impacted by changes.
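
To make the idea of relating identifiers and comments across classes concrete, the following is a minimal sketch only, not the measures defined in the paper. It assumes a plain term-frequency vector space model with cosine similarity as a stand-in for the IR technique, which the abstract does not specify. The function names (`tokenize`, `cosine`, `conceptual_similarity`) and the sample class texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(source: str) -> list[str]:
    """Turn identifiers and comments into lowercase word tokens."""
    # Insert a space at camelCase boundaries, e.g. "parseUrl" -> "parse Url".
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", source)
    # Keep alphabetic runs only; underscores, digits and punctuation split tokens.
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def conceptual_similarity(class_a: str, class_b: str) -> float:
    """Similarity of two classes based only on their textual content."""
    return cosine(Counter(tokenize(class_a)), Counter(tokenize(class_b)))


# Hypothetical class texts (identifiers plus comments), for illustration only.
bookmark_store = "class BookmarkStore { saveBookmark(); // save a bookmark to disk }"
bookmark_menu = "class BookmarkMenu { showBookmark(); // show a bookmark in the menu }"
socket_pool = "class SocketPool { openSocket(); // open a network socket }"

# Rank candidate classes by textual similarity to the class being changed.
candidates = {"BookmarkMenu": bookmark_menu, "SocketPool": socket_pool}
ranking = sorted(
    candidates,
    key=lambda name: conceptual_similarity(bookmark_store, candidates[name]),
    reverse=True,
)
```

In this sketch, classes that share vocabulary score higher even when no call or inheritance relation links them, which is the intuition behind using such a score to rank classes that a change might affect. A real study would likely need a proper parser, stop-word handling and a stronger IR model than raw term counts.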

Notes

  1. Mozilla is a web browser and is available at http://www.mozilla.org/ (verified 27/06/08)

  2. WinMerge is a visual text file differencing and merging tool for Windows and can be found at http://sourceforge.net/projects/winmerge (verified 27/06/08)

  3. TortoiseCVS is a concurrent versions system (CVS) tool for Windows and can be found at http://sourceforge.net/projects/tortoisecvs (verified 27/06/08)

  4. The bug can be accessed in Bugzilla at https://bugzilla.mozilla.org/show_bug.cgi?id=232570 (verified 27/06/08)

  5. Bugzilla is a web-based general-purpose bug-tracking tool originally developed and used by the Mozilla project, and licensed under the Mozilla Public License. Bugzilla can be found at http://bugzilla.mozilla.org/ (verified 27/06/08)

  6. The bug can be accessed in Bugzilla at https://bugzilla.mozilla.org/show_bug.cgi?id=226439 (verified 27/06/08)

Acknowledgements

This research was supported in part by grants from the U.S. National Science Foundation (CCF-0438970 and CCF-0820133), by the Hungarian national grants GVOP-3.3.1.-2004-04-0024/3.0 and GVOP-3.1.1.-2004-05-0345/3.0 and by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences. We would like to thank the anonymous reviewers for their pertinent and helpful comments.

Author information

Corresponding author

Correspondence to Andrian Marcus.

Additional information

Guest Editors: Tim Menzies and Letha Etzkorn

Denys Poshyvanyk performed this work while at Wayne State University.

About this article

Cite this article

Poshyvanyk, D., Marcus, A., Ferenc, R. et al. Using information retrieval based coupling measures for impact analysis. Empir Software Eng 14, 5–32 (2009). https://doi.org/10.1007/s10664-008-9088-2

  • DOI: https://doi.org/10.1007/s10664-008-9088-2
