Abstract
Decision-making styles have been studied in non-situational settings using classical survey instruments. This study proposes a novel methodology for identifying decision-making styles in a real-world purchasing situation using only behavioral data and machine learning. We base our analysis on a two-week sample of 1,347,854 clickstream sessions from an e-commerce company and extract a series of parameters to infer the search goal, strategy, and decision difficulty. We implement a range of unsupervised algorithms, and we identify and validate three internally stable classes of decision-makers. One class corresponds to the classical style of satisficers; the other two subdivide the classical style of maximizers. Customers' entry channel preferences and movement patterns provide compelling support for the predictive validity of the styles. This study contributes to research and practice by proposing a new methodology for recognizing customers' decision styles in the e-commerce setting.
Notes
There are other approaches to understanding decision-making styles in the consumer behavior literature. For example, the Consumer Styles Inventory (CSI), created by Sproles & Kendall (1986), identifies eight possible styles that influence behavior. However, many of these approaches lack generality. The CSI was developed in the 1980s, when Internet shopping was non-existent, to assess attitudes and behaviors related to buying personal items.
Because clickstream data capture visitors at different stages of their decision-making process, the data for a given time frame include (i) complete visits (i.e., visitors who started the search and purchased within the website's observation period); (ii) visitors who are currently exploring the alternatives or are close to a decision, but whose purchase occurs outside the observation period (right-censored); (iii) visitors who purchase within the observation period, but whose search process began before it (left-censored); (iv) visitors who only explore the online environment without intending to purchase, or who intend to purchase offline; and (v) shallow or unrelated visits (Moe, 2003; Schellong et al., 2016). In the first part of the study, we focus on complete visits to reliably determine the decision-makers' online behavior and recognize their distinguishing characteristics. In the cluster validation process, we include a sample of right-censored visitors, that is, visitors who are exploring the alternatives or are close to a decision but whose purchase occurs outside the observation period.
Unsupervised learning is a type of machine learning in which models are trained on an unlabeled dataset. This technique is appropriate in our study since we have no prior information about the customers' class: no labels are given to the learning algorithm, which is left on its own to find structure in its input.
The cophenetic coefficient is the correlation between the original distance matrix of the objects and the cophenetic distance matrix derived from the dendrogram. The cophenetic distance between two observations is the intergroup dissimilarity at which they are first merged into a single cluster. The clustering fits well when the cophenetic coefficient is close to 1.
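As an illustration of this definition, the following minimal sketch computes the coefficient using only the Python standard library. It is not the procedure used in the study: average linkage and Euclidean distance are assumptions made for the example, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cophenetic_distances(dist, n):
    """Naive average-linkage clustering on a pairwise distance dict.

    dist maps frozenset({i, j}) to a distance. The result has the same keys
    and holds the dissimilarity at which i and j were first merged.
    """
    clusters = [frozenset([i]) for i in range(n)]
    coph = {}

    def linkage(c1, c2):
        # Average of all pairwise distances between the two clusters.
        total = sum(dist[frozenset((i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Merge the closest pair of clusters and record the merge height.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        height = linkage(c1, c2)
        for i in c1:
            for j in c2:
                coph[frozenset((i, j))] = height
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return coph


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (non-constant)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def cophenetic_coefficient(points):
    """Correlation between original and cophenetic pairwise distances."""
    n = len(points)
    dist = {
        frozenset((i, j)): sqrt(sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
        for i, j in combinations(range(n), 2)
    }
    coph = cophenetic_distances(dist, n)
    keys = list(dist)
    return pearson([dist[k] for k in keys], [coph[k] for k in keys])


# Two tight, well-separated pairs: the dendrogram preserves the distances well.
print(cophenetic_coefficient([(0, 0), (0, 1), (10, 0), (10, 1)]))
```

The naive merge loop is only suitable for small examples; for data of realistic size, an established hierarchical clustering library would be the practical choice.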
Connectivity measures the extent to which neighboring observations are clustered together. The Dunn index is the ratio of the shortest distance between observations in distinct clusters to the largest cluster diameter. Silhouette considers how near the different clusters are to one another (inter-cluster separation) as well as the magnitude of the intra-cluster variations (i.e., compactness). Connectivity (values between 0 and ∞) should be minimized. Dunn (values between 0 and ∞) and Silhouette (values between -1 and 1) should be maximized (Brock et al., 2008).
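To make two of these measures concrete, the sketch below computes the Dunn index and the mean silhouette width for a small labelled point set. It is an illustrative implementation under stated assumptions (Euclidean distance, at least two clusters, at least one cluster with two or more points, silhouette set to 0 for singleton clusters), not the clValid code; connectivity is omitted, and the function names are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    pairs = list(combinations(zip(points, labels), 2))
    min_between = min(dist(p, q) for (p, lp), (q, lq) in pairs if lp != lq)
    max_diameter = max(dist(p, q) for (p, lp), (q, lq) in pairs if lp == lq)
    return min_between / max_diameter


def mean_silhouette(points, labels):
    """Average silhouette width over all observations."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the own cluster
        # (the zero distance from p to itself does not change the sum).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
print(dunn_index(points, labels))       # larger is better
print(mean_silhouette(points, labels))  # closer to 1 is better
```

In this toy example the clusters are compact and far apart, so both measures are high; merging or splitting the pairs would lower them.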
The adjusted Rand index measures how similar two market segmentation solutions are while correcting for agreement by chance. The adjusted Rand index is 1 if two market segmentation solutions are identical, and 0 if the agreement between the two market segmentation solutions is the same as expected by chance (Dolnicar et al., 2018).
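The chance correction can be illustrated with a short sketch of the adjusted Rand index in the form given by Hubert and Arabie (1985), written with the Python standard library only. The function name and the handling of degenerate cases (returning 1.0 when the index is undefined) are assumptions made for the example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same observations."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must cover the same observations")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two observations: treated as identical

    # Pairs of observations placed together in both, in A, and in B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both solutions have a single segment
    return (together_both - expected) / (maximum - expected)


# Same segments under different labels: the index ignores label names.
print(adjusted_rand_index([0, 0, 1, 1, 2, 2], [1, 1, 2, 2, 0, 0]))  # 1.0
# Segments that cut across each other: agreement below chance level.
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))  # -0.5
```

The second call shows that the index can fall below 0 when two solutions agree less than would be expected by chance.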
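The Jaccard index is interpreted in a similar way to the adjusted Rand index: a value of 1 indicates identical solutions, and lower values indicate weaker agreement. Unlike the adjusted Rand index, it is not corrected for agreement by chance.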
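A minimal sketch of a cluster-wise use of the Jaccard index follows. It assumes the set-based form (size of the intersection divided by size of the union of two clusters' members) and matches each cluster of a reference solution to its most similar cluster in a second solution; whether this corresponds exactly to the computation in the study is an assumption, and the function names are hypothetical.

```python
def jaccard(set_a, set_b):
    """Jaccard similarity of two sets of observation indices."""
    a, b = set(set_a), set(set_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def clusterwise_jaccard(labels_ref, labels_new):
    """For each reference cluster, the best Jaccard match in the new solution."""

    def members(labels):
        groups = {}
        for idx, lab in enumerate(labels):
            groups.setdefault(lab, set()).add(idx)
        return groups

    ref, new = members(labels_ref), members(labels_new)
    return {
        lab: max(jaccard(own, other) for other in new.values())
        for lab, own in ref.items()
    }


# Observation 2 moves from the first cluster to the second in the new solution.
print(clusterwise_jaccard([0, 0, 0, 1, 1, 1], [5, 5, 7, 7, 7, 7]))
```

In a bootstrap setting, values of this kind would be averaged over many resampled solutions to judge how stable each cluster is.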
Gower distance (Gower, 1971) calculates the distance between records that contain combinations of logical, numerical, categorical, or text data.
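For mixed data of this kind, a simplified sketch of the Gower distance between two records is given below. It assumes equal feature weights, range-scaled absolute differences for numeric fields, simple matching for all other fields, and no missing values; the record layout (session minutes, pages viewed, entry channel, mobile flag) is hypothetical and not taken from the study.

```python
def gower_distance(rec_a, rec_b, numeric_ranges):
    """Simplified Gower distance between two mixed-type records.

    numeric_ranges maps the position of each numeric field to its range
    (maximum minus minimum) over the whole dataset. All other fields are
    treated as categorical or logical and compared by simple matching.
    """
    if len(rec_a) != len(rec_b):
        raise ValueError("records must have the same number of fields")
    total = 0.0
    for i, (a, b) in enumerate(zip(rec_a, rec_b)):
        if i in numeric_ranges:
            rng = numeric_ranges[i]
            # Range-scaled absolute difference, between 0 and 1.
            total += abs(a - b) / rng if rng > 0 else 0.0
        else:
            # 0 if the categories match, 1 otherwise.
            total += 0.0 if a == b else 1.0
    return total / len(rec_a)


# Hypothetical records: (session_minutes, pages_viewed, entry_channel, used_mobile)
visit_a = (10.0, 5, "search", True)
visit_b = (30.0, 5, "email", True)
ranges = {0: 40.0, 1: 20}  # assumed dataset-wide ranges of the numeric fields

print(gower_distance(visit_a, visit_b, ranges))  # (0.5 + 0 + 1 + 0) / 4 = 0.375
```

Because every per-field contribution lies between 0 and 1, the resulting distance also lies between 0 and 1, which allows numeric and categorical fields to be combined on a common scale.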
SAHN often recommends a 2-cluster solution as the optimal solution because the value of any stopping rule in a tree-like structure algorithm is typically highest when moving from two to one cluster (Hair et al., 2006).
References
Anderl, E., Schumann, J. H., & Kunz, W. (2016). Helping firms reduce complexity in multichannel online data: A new taxonomy-based approach for customer journeys. Journal of Retailing, 92(2), 185–203. https://doi.org/10.1016/j.jretai.2015.10.001
Anderson, C. J. (2003). The psychology of doing nothing: Forms of decision avoidance result from reason and emotion. Psychological Bulletin, 129(1), 139. https://doi.org/10.1037/0033-2909.129.1.139
Ariely, D. (2000). Controlling the information flow: Effects on consumers’ decision making and preferences. Journal of Consumer Research, 27(2), 233–248. https://doi.org/10.1086/314322
Arunachalam, D., & Kumar, N. (2018). Benefit-based consumer segmentation and performance evaluation of clustering approaches: An evidence of data-driven decision-making. Expert Systems with Applications, 111, 11–34. https://doi.org/10.1016/j.eswa.2018.03.007
Basu, S. (2018). Information search in the internet markets: Experience versus search goods. Electronic Commerce Research and Applications, 30(July-August), 25–37. https://doi.org/10.1016/j.elerap.2018.05.004
Blattberg, R. C., Kim, B., & Neslin, S. (2008). Database Marketing (International Series in Quantitative Marketing, Vol. 18). Springer, New York, USA.
Brannon, D. C., & Soltwisch, B. W. (2017). If it has lots of bells and whistles, it must be the best: How maximizers and satisficers evaluate feature-rich versus feature-poor products. Marketing Letters, 28(4), 651–662. https://doi.org/10.1007/s11002-017-9440-7
Brock, G., Pihur, V., Datta, S., & Datta, S. (2008). clValid: An R Package. Journal of Statistical Software, 25(4), 1–22. https://doi.org/10.18637/jss.v025.i04
Broniarczyk, S. M., & Griffin, J. G. (2014). Decision difficulty in the age of consumer empowerment. Journal of Consumer Psychology, 24(4), 608–625. https://doi.org/10.1016/j.jcps.2014.05.003
Buchta, C., Hahsler, M., Diaz, D., Buchta, M. C., & Zaki, M. J. (2020). arulesSequences: An R Package. CRAN Repository. https://cran.r-project.org/web/packages/arulesSequences/arulesSequences.pdf
Bucklin, R. E., & Sismeiro, C. (2009). Click here for Internet insight: Advances in clickstream data analysis in marketing. Journal of Interactive Marketing, 23(1), 35–48. https://doi.org/10.1016/j.intmar.2008.10.004
Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. Erlbaum.
Carrillat, F. A., Ladik, D. M., & Legoux, R. (2011). When the decision ball keeps rolling: An investigation of the Sisyphus effect among maximizing consumers. Marketing Letters, 22(3), 283–296. https://doi.org/10.1007/s11002-010-9125-y
Chabris, C. F., Laibson, D., Morris, C. L., Schuldt, J. P., & Taubinsky, D. (2008). Measuring intertemporal preferences using response times. Retrieved from https://www.nber.org/papers/w14353
Cheng, J., & González-Vallejo, C. (2018). Unpacking decision difficulty: Testing action dynamics in intertemporal, gamble, and consumer choices. Acta Psychologica, 190, 199–216. https://doi.org/10.1016/j.actpsy.2018.08.002
Chowdhury, T. G., Ratneshwar, S., & Mohanty, P. (2009). The time-harried shopper: Exploring the differences between maximizers and satisficers. Marketing Letters, 20(2), 155–167. https://doi.org/10.1007/s11002-008-9063-0
Cottrell, M., Hammer, B., Hasenfuß, A., & Villmann, T. (2006). Batch and median neural gas. Neural Networks, 19(6–7), 762–771. https://doi.org/10.1016/j.neunet.2006.05.018
Dalal, D. K., Diab, D. L., Zhu, X., & Hwang, T. (2015). Understanding the Construct of Maximizing Tendency: A Theoretical and Empirical Evaluation. Journal of Behavioral Decision Making, 28(5), 437–450. https://doi.org/10.1002/bdm.1859
De Haan, E., Kannan, P., Verhoef, P. C., & Wiesel, T. (2018). Device switching in online purchasing: Examining the strategic contingencies. Journal of Marketing, 82(5), 1–19. https://doi.org/10.1509/jm.17.0113
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1), 1–38. https://doi.org/10.1111/j.2517-6161.1977.tb01600.x
Diab, D. L., Gillespie, M. A., & Highhouse, S. (2008). Are maximizers really unhappy? The measurement of maximizing tendency. Judgment and Decision Making, 3(5), 364. https://psycnet.apa.org/record/2008-09227-001
Ding, A. W., Li, S., & Chatterjee, P. (2015). Learning User Real-Time Intent for Optimal Dynamic Web Page Transformation. Information Systems Research, 26(2), 339–359. https://doi.org/10.1287/isre.2015.0568
Dolnicar, S., Grün, B., & Leisch, F. (2018). Market segmentation analysis: Understanding it, doing it, and making it useful. Springer Nature.
Dolnicar, S., & Leisch, F. (2010). Evaluation of structure and reproducibility of cluster solutions using the bootstrap. Marketing Letters, 21(1), 83–101. https://doi.org/10.1007/s11002-009-9083-4
Fu, W.-T., & Pirolli, P. (2007). SNIF-ACT: A cognitive model of user navigation on the world wide web. Human-Computer Interaction, 22(4), 355–412.
Furner, C. P., & Zinko, R. A. (2017). The influence of information overload on the development of trust and purchase intention based on online product reviews in a mobile vs. web environment: An empirical investigation. Electronic Markets, 27(3), 211–224. https://doi.org/10.1007/s12525-016-0233-2
Genewein, T., Leibfried, F., Grau-Moya, J., & Braun, D. A. (2015). Bounded rationality, abstraction, and hierarchical decision-making: An information-theoretic optimality principle. Frontiers in Robotics and AI, 2, 27. https://doi.org/10.3389/frobt.2015.00027
Ghose, A., Goldfarb, A., & Han, S. P. (2013). How is the mobile Internet different? Search costs and local activities. Information Systems Research, 24(3), 613–631. https://doi.org/10.1287/isre.1120.0453
Gower, J. C. (1971). A general coefficient of similarity and some of its properties. Biometrics, 27(4), 857–871. https://doi.org/10.2307/2528823
Hair, J. F., Black, W. C., Babin, B. J., Anderson, R. E., & Tatham, R. L. (2006). Multivariate Data Analysis (7th ed., Vol. 6). Pearson Prentice Hall, Upper Saddle River, NJ.
Hennig, C. (2007). Cluster-wise assessment of cluster stability. Computational Statistics & Data Analysis, 52(1), 258–271. https://doi.org/10.1016/j.csda.2006.11.025
Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218. https://doi.org/10.1007/BF01908075
Häubl, G., & Trifts, V. (2000). Consumer decision making in online shopping environments: The effects of interactive decision aids. Marketing Science, 19(1), 4–21. https://doi.org/10.1287/mksc.19.1.4.15178
Iwanaga, J., Nishimura, N., Sukegawa, N., & Takano, Y. (2019). Improving collaborative filtering recommendations by estimating user preferences from clickstream data. Electronic Commerce Research and Applications, 37, 100877. https://doi.org/10.1016/j.elerap.2019.100877
Iyengar, S. S., Wells, R. E., & Schwartz, B. (2006). Doing better but feeling worse: Looking for the “best” job undermines satisfaction. Psychological Science, 17(2), 143–150. https://doi.org/10.1111/j.1467-9280.2006.01677.x
Jaccard, P. (1912). The distribution of the flora in the alpine zone. 1. New Phytologist, 11(2), 37–50. https://doi.org/10.1111/j.1469-8137.1912.tb05611.x
Kannan, P., & Li, H. (2017). Digital marketing: A framework, review and research agenda. International Journal of Research in Marketing, 34(1), 22–45. https://doi.org/10.1016/j.ijresmar.2016.11.006
Karimi, S., Holland, C. P., & Papamichail, K. N. (2018). The impact of consumer archetypes on online purchase decision-making processes and outcomes: A behavioural process perspective. Journal of Business Research, 91(May), 71–82. https://doi.org/10.1016/j.jbusres.2018.05.038
Karimi, S., Papamichail, K. N., & Holland, C. P. (2015). The effect of prior knowledge and decision-making style on the online purchase decision-making process: A typology of consumer shopping behaviour. Decision Support Systems, 77(1), 137–147. https://doi.org/10.1016/j.dss.2015.06.004
Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30(1–2), 81–93. https://doi.org/10.2307/2332226
Klein, L. R. (1998). Evaluating the potential of interactive media through a new lens: Search versus experience goods. Journal of Business Research, 41(3), 195–203. https://doi.org/10.1016/S0148-2963(97)00062-3
Kohonen, T. (1998). The self-organizing map. Neurocomputing, 21(1), 1–6. https://doi.org/10.1109/5.58325
Kruskal, W. H., & Wallis, W. A. (1952). Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47(260), 583–621. https://doi.org/10.2307/2280779
Kullback, S. (1959). Statistics and information theory. J. Wiley and Sons.
Laband, D. N. (1991). An objective measure of search versus experience goods. Economic Inquiry, 29(3), 497–509. https://doi.org/10.1111/j.1465-7295.1991.tb00842.x
Lai, L. (2010). Maximizing without difficulty: A modified maximizing scale and its correlates. Judgment and Decision Making, 5(3), 164. https://psycnet.apa.org/record/2010-13808-004
Leisch, F., Dimitriadou, E., & Leisch, M. F. (2018). Package ‘flexclust’. CRAN Repository. https://cran.r-project.org/web/packages/flexclust/flexclust.pdf
Li, H., & Kannan, P. (2014). Attributing conversions in a multichannel online marketing environment: An empirical model and a field experiment. Journal of Marketing Research, 51(1), 40–56. https://doi.org/10.1509/jmr.13.0050
Lingras, P., Hogo, M., Snorek, M., & West, C. (2005). Temporal analysis of clusters of supermarket customers: Conventional versus interval set approach. Information Sciences, 172(1–2), 215–240. https://doi.org/10.1016/j.ins.2004.12.007
Ma, J., & Roese, N. J. (2014). The Maximizing Mind-Set. Journal of Consumer Research, 41(1), 71–92. https://doi.org/10.1086/674977
Mandel, N., & Johnson, E. J. (2002). When web pages influence choice: Effects of visual primes on experts and novices. Journal of Consumer Research, 29(2), 235–245. https://doi.org/10.1086/341573
Mao, W. (2016). When one desires too much of a good thing: The compromise effect under maximizing tendencies. Journal of Consumer Psychology, 26(1), 66–80. https://doi.org/10.1016/j.jcps.2015.04.007
Martinetz, T. M., Berkovich, S. G., & Schulten, K. J. (1993). ‘Neural-gas’ network for vector quantization and its application to time-series prediction. IEEE Transactions on Neural Networks, 4(4), 558–569. https://doi.org/10.1109/72.238311
McClure, S. M., Laibson, D. I., Loewenstein, G., & Cohen, J. D. (2004). Separate Neural Systems Value Immediate and Delayed Monetary Rewards. Science, 306(5695), 503–507. https://doi.org/10.1126/science.1100907
Misuraca, R., Faraci, P., Gangemi, A., Carmeci, F. A., & Miceli, S. (2015). The Decision Making Tendency Inventory: A new measure to assess maximizing, satisficing, and minimizing. Personality and Individual Differences, 85, 111–116. https://doi.org/10.1016/j.paid.2015.04.043
Moe, W. W. (2003). Buying, searching, or browsing: Differentiating between online shoppers using in-store navigational clickstream. Journal of Consumer Psychology, 13(1), 29–39. https://doi.org/10.1207/S15327663JCP13-1&2_03
Montgomery, A. L., Li, S., Srinivasan, K., & Liechty, J. C. (2004). Modeling online browsing and path analysis using clickstream data. Marketing Science, 23(4), 579–595. https://doi.org/10.1287/mksc.1040.0073
Moyano-Díaz, E., & Llanos, R. M. (2020). New approaches to maximization: Evidence of correlations with malaise and well-being in the Chilean adult population. Revista CES Psicología, 13(1), 18–31. https://doi.org/10.21615/cesp.13.1.2
Nakayama, M., Sutcliffe, N., & Wan, Y. (2010). Has the web transformed experience goods into search goods? Electronic Markets, 20(3–4), 251–262. https://doi.org/10.1007/s12525-010-0041-z
Nelson, P. (1974). Advertising as information. Journal of Political Economy, 82(4), 729–754. https://www.jstor.org/stable/1837143
Nenkov, G. Y., Morrin, M., Schwartz, B., Ward, A., & Hulland, J. (2008). A short form of the Maximization Scale: Factor structure, reliability and validity studies. Judgment and Decision Making, 3(5), 371–388. https://psycnet.apa.org/record/2008-09227-002
Nottorf, F. (2014). Modeling the clickstream across multiple online advertising channels using a binary logit with Bayesian mixture of normals. Electronic Commerce Research and Applications, 13(1), 45–55. https://doi.org/10.1016/j.elerap.2013.07.004
Ölander, F. (1975). Search behavior in non-simultaneous choice situations: Satisficing or maximizing? In D. Wendt & C. Vlek (Eds.), Utility, probability, and human decision making (pp. 297–320). D. Reidel Publishing Company.
Parker, A. M., De Bruin, W. B., & Fischhoff, B. (2007). Maximizers versus satisficers: Decision-making styles, competence, and outcomes. Judgment and Decision Making, 2(6), 342–350. https://psycnet.apa.org/record/2008-00191-002
Payne, J. W., Bettman, J. R., & Johnson, E. J. (1988). Adaptive strategy selection in decision making. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14(3), 534–552. https://doi.org/10.1037/0278-7393.14.3.534
Payne, J. W., Bettman, J. R., & Johnson, E. J. (1993). The Adaptive Decision Maker. Cambridge University Press.
Pena, J. M., Lozano, J. A., & Larranaga, P. (1999). An empirical comparison of four initialization methods for the k-means algorithm. Pattern Recognition Letters, 20(10), 1027–1040. https://doi.org/10.1016/S0167-8655(99)00069-0
Pirolli, P., & Card, S. (1999). Information foraging. Psychological Review, 106(4), 643. https://doi.org/10.1037/0033-295X.106.4.643
Polman, E. (2010). Why are maximizers less happy than satisficers? Because they maximize positive and negative outcomes. Journal of Behavioral Decision Making, 23(2), 179–190. https://doi.org/10.1002/bdm.647
Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66(336), 846–850. https://doi.org/10.1080/01621459.1971.10482356
Rousseeuw, P. J., & Kaufman, L. (1990). Finding groups in data. Wiley.
Schellong, D., Kemper, J., & Brettel, M. (2016). Clickstream data as a source to uncover consumer shopping types in a large-scale online setting. 24th European Conference on Information Systems (ECIS), Istanbul, Turkey.
Schwartz, B. (2004). The paradox of choice: Why more is less. Ecco.
Schwartz, B. (2016). On the meaning and measurement of maximization. Judgment and Decision Making, 11(2), 126–146. https://psycnet.apa.org/record/2016-16702-001
Schwartz, B., Ward, A., Monterosso, J., Lyubomirsky, S., White, K., & Lehman, D. R. (2002). Maximizing versus satisficing: Happiness is a matter of choice. Journal of Personality and Social Psychology, 83(5), 1178–1197. https://doi.org/10.1037/0022-3514.83.5.1178
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6(2), 461–464. https://www.jstor.org/stable/2958889
Shehu, E., Papies, D., & Neslin, S. A. (2020). Free shipping promotions and product returns. Journal of Marketing Research, in Press. https://doi.org/10.1177/0022243720921812
Shocker, A. D., Ben-Akiva, M., Boccara, B., & Nedungadi, P. (1991). Consideration set influences on consumer decision-making and choice: Issues, models, and suggestions. Marketing Letters, 2(3), 181–197. http://www.jstor.org/stable/40216215
Simon, H. A. (1955). A behavioral model of rational choice. Quarterly Journal of Economics, 69(1), 99–118. https://doi.org/10.2307/1884852
Sismeiro, C., & Bucklin, R. E. (2004). Modeling purchase behavior at an e-commerce web site: A task-completion approach. Journal of Marketing Research, 41(3), 306–323. https://doi.org/10.1509/jmkr.41.3.306.35985
Sneath, P. H., & Sokal, R. R. (1973). Numerical taxonomy. The principles and practice of numerical classification. W.H. Freeman and Company.
Sokal, R. R., & Rohlf, F. J. (1962). The comparison of dendrograms by objective methods. Taxon, 11(2), 33–40. https://doi.org/10.2307/1217208
Sproles, G. B., & Kendall, E. L. (1986). A Methodology for Profiling Consumers’ Decision-Making Styles. Journal of Consumer Affairs, 20(2), 267–279. https://doi.org/10.1111/j.1745-6606.1986.tb00382.x
Van Erven, T., & Harremoës, P. (2014). Rényi divergence and Kullback-Leibler divergence. IEEE Transactions on Information Theory, 60(7), 3797–3820. https://doi.org/10.1109/TIT.2014.2320500
Vesanto, J., & Alhoniemi, E. (2000). Clustering of the self-organizing map. IEEE Transactions on Neural Networks, 11(3), 586–600. https://doi.org/10.1109/72.846731
Vogrincic-Haselbacher, C., Krueger, J. I., Lurger, B., Dinslaken, I., Anslinger, J., Caks, F., Florack, A., Brohmer, H. & Athenstaedt, U. (2021). Not Too Much and Not Too Little: Information Processing for a Good Purchase Decision. Frontiers in Psychology, 12, 642641. https://doi.org/10.3389/fpsyg.2021.642641
Weathers, D., & Makienko, I. (2006). Assessing the relationships between e-tail success and product and web site factors. Journal of Interactive Marketing, 20(2), 41–54. https://doi.org/10.1002/dir.20060
Weaver, K., Daniloski, K., Schwarz, N., & Cottone, K. (2015). The role of social comparison for maximizers and satisficers: Wanting the best or wanting to be the best? Journal of Consumer Psychology, 25(3), 372–388. https://doi.org/10.1016/j.jcps.2014.10.003
Wedel, M., & Kannan, P. (2016). Marketing analytics for data-rich environments. Journal of Marketing, 80(6), 97–121. https://doi.org/10.1509/jm.15.0413
Wilcoxon, F. (1992). Individual comparisons by ranking methods. In Breakthroughs in statistics (pp. 196–202). Springer.
Yadav, M. S., & Pavlou, P. A. (2014). Marketing in computer-mediated environments: Research synthesis and new directions. Journal of Marketing, 78(1), 20–40. https://doi.org/10.1509/jm.12.0020
Yang, L., Toubia, O., & De Jong, M. G. (2015). A bounded rationality model of information search and choice in preference measurement. Journal of Marketing Research, 52(2), 166–183. https://doi.org/10.1509/jmr.13.0288
Zhang, J., Fang, X., & Liu Sheng, O. R. (2006). Online consumer search depth: Theories and new findings. Journal of Management Information Systems, 23(3), 71–95. http://www.jstor.org/stable/40398856
Acknowledgements
The author gratefully thanks the anonymous company that was the source of the data used in the study, Rune Bysted and Prof. Hans Jørn Juhl (Aarhus University) for facilitating the data, Juan Lago (Lead Supply), Karin Vinding (Aarhus University), Prof. Yun Wan (Associate Editor), and the three anonymous reviewers for their helpful comments on a previous version of this manuscript.
Additional information
Responsible Editor: Yun Wan
Cite this article
Tudoran, A.A. A machine learning approach to identifying decision-making styles for managing customer relationships. Electron Markets 32, 351–374 (2022). https://doi.org/10.1007/s12525-021-00515-x
DOI: https://doi.org/10.1007/s12525-021-00515-x