Abstract
Standard sequential pattern mining largely ignores the positions of events within a sequence, which makes it difficult to focus on the more interesting patterns that better represent causal relationships between events. Without quantifying how close two events are in a sequence, we may fail to evaluate how likely an event is to be caused by the other events in the pattern, which is a severe drawback for applications such as prediction. Motivated by this, we propose a recency-based sequential pattern mining scheme together with a novel measure of pattern interestingness that captures recency as well as frequency. To efficiently extract all recency-based sequential patterns, we devise a mining algorithm called Recency-based Frequent pattern Miner (RF-Miner), together with an effective prediction method for evaluating the quality of recency-based patterns in terms of their predictive power. The experimental results show that RF-Miner extracts more diverse and important patterns that can be used to predict the next event, and that, by exploiting the upper bounds of our measure, it runs more efficiently than the baseline algorithms.

Notes
We truncated each sequence in the original FIFA data set, available from http://www.philippe-fournier-viger.com/spmf/index.php?link=datasets.php, to a quarter of its original length in order to reduce the execution time of all algorithms.
Acknowledgements
This work was supported in part by the National Research Foundation of Korea (NRF) Grant Funded by the Korea government (MSIT) (NRF-2018R1D1A1B07049934), in part by Institute of Information & Communications Technology Planning & Evaluation (IITP) Grants Funded by the Korea government (MSIT) (2019-0-00240, 2019-0-00064, 2017-0-00396, and 2020-0-01389, Artificial Intelligence Convergence Research Center (Inha University)), and in part by INHA UNIVERSITY Research Grant.
Additional information
Responsible editor: M. J. Zaki.
Cite this article
Kim, H., Choi, DW. Recency-based sequential pattern mining in multiple event sequences. Data Min Knowl Disc 35, 127–157 (2021). https://doi.org/10.1007/s10618-020-00715-7