Planning quality assurance (QA) activities systematically and controlling their execution are challenging tasks for companies that develop software or software-intensive systems. Both tasks require the ability to estimate the effectiveness of the applied QA techniques and the defect content of the checked artifacts. Existing approaches for these purposes need extensive measurement data from historical projects. Because many companies do not collect enough data to apply these approaches (especially early in the project lifecycle), they typically base their QA planning and controlling solely on expert opinion. This article presents a hybrid method that combines commonly available measurement data with context-specific expert knowledge. To evaluate the method’s applicability and usefulness, we conducted a case study in the context of independent verification and validation activities for critical software in the space domain. A hybrid defect content and effectiveness model was developed for the software requirements analysis phase and evaluated with available legacy data. One major result is that the hybrid model estimates more accurately than the applicable models based solely on data: the mean magnitude of relative error (MMRE) determined by cross-validation is 29.6%, compared to 76.5% for the most accurate data-based model.
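For readers unfamiliar with the accuracy measure quoted above, the following is a minimal sketch of how an MMRE can be computed under leave-one-out cross-validation. It is not the model or the evaluation procedure of the case study: the function names `mmre` and `leave_one_out_mmre`, the mean-based baseline predictor, and the defect counts are all hypothetical and serve only to illustrate the metric. The abstract does not state which form of cross-validation was used, so leave-one-out is an assumption here.

```python
from statistics import mean


def mmre(actual, estimated):
    """Mean magnitude of relative error: mean of |actual - estimated| / actual."""
    if not actual or len(actual) != len(estimated):
        raise ValueError("need two non-empty sequences of equal length")
    return mean(abs(a - e) / a for a, e in zip(actual, estimated))


def leave_one_out_mmre(actual, fit_and_predict):
    """Estimate each observation from all the others, then compute the MMRE.

    fit_and_predict is a hypothetical callable that receives the training
    values and returns one estimate for the observation that was held out.
    """
    estimates = []
    for i in range(len(actual)):
        training = actual[:i] + actual[i + 1:]  # hold out observation i
        estimates.append(fit_and_predict(training))
    return mmre(actual, estimates)


# Made-up defect counts for four artifacts (illustration only, not study data).
defects = [10, 20, 30, 40]

# Hypothetical data-only baseline: predict the mean of the training values.
baseline_error = leave_one_out_mmre(defects, mean)
print(f"Baseline MMRE: {baseline_error:.1%}")
```

A lower MMRE indicates that the estimates lie closer to the actual values on average, which is the sense in which 29.6% is an improvement over 76.5%.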

We would like to thank the development project staff and the IV&V staff from the JAXA Engineering Digital Innovation Center (JEDI) at the Japan Aerospace Exploration Agency (JAXA), where we conducted the case study to construct the hybrid prediction model. We would also like to thank the staff of JAMSS, who contributed greatly by answering the questionnaires and providing historical experience data. Finally, we would like to thank Adam Trendowicz and Marcus Ciolkowski from Fraunhofer IESE for the initial review of the paper, Sonnhild Namingha for proofreading, and the anonymous reviewers of the International Symposium on Software Reliability Engineering and the Journal of Empirical Software Engineering for their valuable feedback. Parts of this work have been funded by the BMBF SE2006 project TestBalance (grant 01 IS F08 D).
Additional information
Editor: Laurie Williams
Cite this article
Kläs, M., Nakao, H., Elberzhager, F. et al. Support planning and controlling of early quality assurance by combining expert judgment and defect data—a case study. Empir Software Eng 15, 423–454 (2010). https://doi.org/10.1007/s10664-009-9112-1
DOI: https://doi.org/10.1007/s10664-009-9112-1