Abstract
Rationale refers to the reasoning and justification behind human decisions, opinions, and beliefs. In software engineering, rationale management focuses on capturing design and requirements decisions and on organizing and reusing project knowledge. This paper takes a different perspective, focusing on the rationale that users express in online reviews. We studied 32,414 reviews of 52 software applications in the Amazon Store. Through a grounded theory approach and peer content analysis, we investigated how users argue and justify their decisions, e.g., about upgrading, installing, or switching software applications. We also studied how frequently rationale concepts such as encountered issues or considered alternatives occur in the reviews and found that assessment criteria such as performance, compatibility, and usability represent the most pervasive concept. We identified a moderate positive correlation between issues and criteria and further assessed how the rationale concepts are distributed with respect to review rating and verbosity. We found that issues tend to appear more in lower-rated reviews, while criteria, alternatives, and justifications seem to appear more in three-star reviews. Also, reviews reporting alternatives seem to be more verbose than reviews reporting criteria. A follow-up qualitative study of sub-concepts revealed that users also report other alternatives (e.g., an alternative software provider), criteria (e.g., cost), and decisions (e.g., on rating the software). We then used the truth set of manually labeled review sentences to explore how accurately rationale concepts can be mined from the reviews. We evaluated the classification algorithms Naive Bayes, Support Vector Machine, Logistic Regression, Decision Tree, Gaussian Process, Random Forest, and Multilayer Perceptron using a baseline and a random configuration. The Support Vector Machine, Naive Bayes, and Logistic Regression classifiers, trained on the review metadata, the syntax tree of the review text, and influential terms, achieved a precision of around 80% for predicting sentences with alternatives and decisions, with top recall values of 98%. On the review level, precision was up to 13% higher, with recall values reaching 99%. Using only word features, the Random Forest algorithm achieved the highest precision and Naive Bayes the highest recall in most cases. We discuss the findings and the importance of rationale for supporting deliberation in user communities and for synthesizing reviews for developers.
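To make the classification setup concrete, the sketch below shows how review sentences could be classified for a single rationale concept (here, whether a sentence reports an alternative) with the scikit-learn library referenced in the notes, using only word (TF-IDF) features and cross-validated precision and recall. It is an illustrative sketch under stated assumptions, not the study's pipeline: the inline sentences, labels, hyperparameters, and fold count are hypothetical, and the richer feature sets used in the paper (review metadata, syntax trees, influential terms) are not reproduced here.

```python
# Illustrative sketch only: binary classification of review sentences
# ("mentions an alternative" vs. not) on bag-of-words/TF-IDF features.
# The tiny inline dataset and hyperparameters are hypothetical.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

# Hypothetical truth-set excerpt: sentence text and a binary label
# (1 = sentence reports an alternative, 0 = it does not).
sentences = [
    "I switched to LibreOffice because this version kept crashing.",
    "Installation was quick and the interface is clean.",
    "Compared to the previous release, this upgrade is slower.",
    "Great value for the price, works as advertised.",
]
labels = [1, 0, 1, 0]

classifiers = {
    "NaiveBayes": MultinomialNB(),
    "SVM": LinearSVC(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "RandomForest": RandomForestClassifier(n_estimators=100),
}

for name, clf in classifiers.items():
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2), stop_words="english")),
        ("clf", clf),
    ])
    # Cross-validated precision and recall, the metrics reported per concept.
    scores = cross_validate(pipeline, sentences, labels, cv=2,
                            scoring=("precision", "recall"))
    print(name,
          "precision=%.2f" % scores["test_precision"].mean(),
          "recall=%.2f" % scores["test_recall"].mean())
```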






Notes
https://github.com/aesuli/Amazon-downloader, accessed March 2016.
http://www.nltk.org/, accessed June 2017.
http://nlp.stanford.edu/software/lex-parser.shtml, accessed June 2017.
http://scikit-learn.org/, accessed March 2017.
Acknowledgements
The authors thank the coders, particularly A. Alizadeh, J. Hennings, E. Kurtanović, D. Martens, and M. Ziaei for their help with the coding. This work is partly funded by the H2020 EU research project OPENREQ (ID 732463).
Cite this article
Kurtanović, Z., Maalej, W. On user rationale in software engineering. Requirements Eng 23, 357–379 (2018). https://doi.org/10.1007/s00766-018-0293-2