Abstract
The burgeoning age of IoT has reinforced the need for robust time series anomaly detection. While there are hundreds of anomaly detection methods in the literature, one definition, time series discords, has emerged as a competitive and popular choice for practitioners. Time series discords are subsequences of a time series that are maximally far away from their nearest neighbors. Perhaps the most attractive feature of discords is their simplicity. Unlike many of the parameter-laden methods proposed, discords require only a single parameter to be set by the user: the subsequence length. We believe that the utility of discords is reduced by sensitivity to even this single user choice. The obvious solution to this problem, computing discords of all lengths then selecting the best anomalies (under some measure), appears at first glance to be computationally untenable. However, in this work we discuss MERLIN, a recently introduced algorithm that can efficiently and exactly find discords of all lengths in massive time series archives. By exploiting computational redundancies, MERLIN is two orders of magnitude faster than comparable algorithms. Moreover, we show that by exploiting a little-known indexing technique called Orchard’s algorithm, we can create a new algorithm called MERLIN++, which is an order of magnitude faster than MERLIN, yet produces identical results. We demonstrate the utility of our ideas on a large and diverse set of experiments and show that MERLIN++ can discover subtle anomalies that defy existing algorithms or even careful human inspection. We further compare to five state-of-the-art rival methods, on the largest benchmark dataset for this task, and show that MERLIN++ is superior in terms of accuracy and speed.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Notes
Note that some algorithms that discover discords may have other parameters, the discord representation itself requires just a single parameter.
Note that this DST anomaly is misidentified in the original work that introduced this dataset as the NY-Marathon anomaly (Ahmad et al. 2017). This misidentification has since been repeated in dozens of papers. We are confident that our labeling is correct. If we correctly process the data with the standard DST algorithm count(1–2am) = ½ apparent count(1–am), then the apparent anomaly disappears.
This name is a play on the fact that the first paper on time series discords was titled “Approximations to Magic” (Lin et al. 2005). Merlin was the magician of the Arthurian legend. In addition, Mitsubishi Electric Corporation’s subsidiary in the USA is called MERL (Mitsubishi Electric Research Laboratory, Boston).
Orchard’s algorithm is obscure enough that when googling for “Orchard’s algorithm”, almost half the results are about algorithms for literal fruit processing.
An order of magnitude simpler in terms of number of parameters to set, of the number of lines of code written etc.
To preempt confusion, we note that Page is an author.
References
Alireza A, Sara A, Shima I, Murillo AC, Gerry AC, Hickle L, Keogh EJ (2020) Fitbit for chickens?: time series data mining can increase the productivity of poultry farms. KDD 2020:3328–3336
Ahmed M, and Abdun Naser M (2014) "Network traffic pattern analysis using improved information theoretic co-clustering based collective anomaly detection." In International conference on security and privacy in communication networks, pp. 204–219. Springer, Cham
Ahmad S, Lavin A, Purdy S, Agha Z (2017) Unsupervised real-time anomaly detection for streaming data. Neurocomputing 262:134–147
Audibert J, Marti S, Guyard F, Zuluaga MA (2021) From univariate to multivariate time series anomaly detection with non-local information. In: Lemaire V, Malinowski S, Bagnall A et al (eds) Advanced analytics and learning on temporal data. Springer International Publishing
Barz B, Yanira Guanche G, Erik R, and Joachim D (2017) "Maximally divergent intervals for extreme weather event detection." In OCEANS 2017-Aberdeen, pp. 1–9. IEEE
Beyer K, Jonathan G, Raghu R, and Uri S (1999) "When is “nearest neighbor” meaningful?" In International conference on database theory, pp. 217–235. Springer, Berlin, Heidelberg
Boniol P, Linardi M, Roncallo F et al (2021) Unsupervised and scalable subsequence anomaly detection in large data series. VLDB J. https://doi.org/10.1007/s00778-021-00655-8
Bu Y, Lei C, Ada W-CF, and Dawei L (2009) "Efficient anomaly monitoring over moving object trajectory streams." In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 159–168. 2009.
Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. In ACM Comput Surv (CSUR) 41(3):1–58
Daigavane A, Wagstaff KL, Doran G, Cochrane C, Jackman C, and Rymer A (2020). Detection of Environment Transitions in Time Series Data for Responsive Science. In MileTS ’20: 6th KDD Workshop on Mining and Learning from Time Series, August 24th, 2020, San Diego, California, USA. ACM, New York, NY, USA, 5 pages.
Däubener Sina, Sebastian Schmitt, Hao Wang, Thomas Bäck, Peter krause, “Large Anomaly Detection in Univariate Time Series: An Empirical Comparison of Machine Learning Algorithms,” 19th Industrial Conference on Data Mining ICDM, 2019.
Ding H, Trajcevski G, Scheuermann P et al (2008) Querying and mining of time series data: experimental comparison of representations and distance measures. Proc VLDB Endow 1:1542–1552. https://doi.org/10.14778/1454159.1454226
Doan Minh T, Sutharshan R, Mahsa S, Masud M, and Christopher L (2015) "Profiling pedestrian distribution and anomaly detection in a dynamic environment." In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pp. 1827–1830. 2015.
Filonov P, Andrey L, and Artem V (2016) "Multivariate industrial time series with cyber-attack simulation: Fault detection using an lstm-based predictive data model."http.arxiv.org./1612.06676
Heldt T, Oefinger MB, Hoshiyama M, Mark RG (2003) Circulatory response topassive and active changes in posture. Comput Cardiol 30:263–266
Huet A, Navarro JM, Rossi D (2022) Local Evaluation of Time Series Anomaly Detection Algorithms. arXiv preprint arXiv:2206.13167.
Hundman K, Valentino C, Christopher L, Ian C, and Tom S (2018) "Detecting spacecraft anomalies using lstms and nonparametric dynamic thresholding." In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, pp. 387–395. 2018.
Hwang W-S, Yun J-H, Kim J, Min BG (2022) Do you know existing accuracy metrics overrate time-series anomaly detections? In: Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing. Association for Computing Machinery, New York, NY, USA, pp 403–412
Imamura M, Takaaki N, Eamonn K (2020). Matrix Profile XXI: A Geometric Approach to Time Series Chains Improves Robustness. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pages 1114–1122
Keogh E, Chakrabarti K, Pazzani M, Mehrotra S (2001) Dimensionality reduction for fast similarity search in large time series databases. Knowl Inf Syst 3:263–286. https://doi.org/10.1007/PL00011669
Keogh E, Jessica L, and Ada F (2005) "Hot sax: Efficiently finding the most unusual time series subsequence." In Fifth IEEE International Conference on Data Mining (ICDM'05), pp. 8-pp. Ieee
Kim S, Choi K, Choi H-S et al (2022) Towards a rigorous evaluation of time-series anomaly detection. Proc AAAI Conf Artif Intell 36:7194–7201. https://doi.org/10.1609/aaai.v36i7.20680
Laptev N and Saeed A, “S5 - A Labeled Anomaly Detection Dataset, version 1.0(16M).” 2015. Distributed by Yahoo Research. https://webscope.sandbox.yahoo.com/catalog.php?datatype=s&did=70.
Lin J, Eamonn K, Ada F, and Helga Van H (2005) "Approximations to magic: Finding unusual medical time series." In 18th IEEE Symposium on Computer-Based Medical Systems (CBMS'05), pp. 329–334. IEEE
Linardi M, Zhu Y, Palpanas T, Keogh E (2020) Matrix profile goes MAD: variable-length motif and discord discovery in data series. Data Min Knowl Discov 34(4):1022–1071
Marcus M, and Minc H (1992). A survey of matrix theory and matrix inequalities (Vol. 14). Courier Corporation.
McRae M, Melbourne D- Bourke St Mall Flash Mob - 29th June 2013. (Jul. 20, 2013). Accessed: May 22, 2020. [Online Video]. Available: www.youtube.com/watch?v=gLzDFjiRQE8.
Mueen A (2015). The Fastest Similarity Search Algorithm for Time Series Subsequences under Euclidean Distance, URL: http://www.cs.unm.edu/~mueen/FastestSimilaritySearch.html
Murray D, Stankovic L, Stankovic V, and Liao J (2015) University of Strathclyde, PURE http://dx.doi.org/https://doi.org/10.15129/31da3ece-f902-4e95-a093-e0a9536983c4
Nakamura T (2021) MERLIN Webpage: https://sites.google.com/view/merlin-find-anomalies.
Nakamura T, Imamura M, Mercer R, Keogh E (2020) “MERLIN: Parameter-Free Discovery of Arbitrary Length Anomalies in Massive Time Series Archives”. ICDM: 1190–1195
Nichiforov C, Stancu I, Stamatescu I, Stamatescu G (2020) Information Extraction Approach for Energy Time Series Modelling, 24th International Conference on System Theory, Control and Computing, ICSTCC 2020, October 8–10, 2020, Sinaia, Romania.
Orchard MT (1991) A fast nearest-neighbor search algorithm. International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2297–2300, IEEE Computer Society Press.
Page, E. S. "On problems in which a change in a parameter occurs at an unknown point." Biometrika 44, no. 1/2 (1957): 248–252.
Park D, Hoshi Y, Kemp CC (2018) A multimodal anomaly detector for robot-assisted feeding using an LSTM-Based variational autoencoder. IEEE Robot Autom Lett 3:1544–1551. https://doi.org/10.1109/LRA.2018.2801475
Pedestrian Counting System (2013) City of Melbourne—Pedestrian counting system. In: Pedestrian Counting System. http://www.pedestrian.melbourne.vic.gov.au/#date=28-10-2021&time=8. Accessed May 22, 2020.
Thanawin R, Bilson JLC, Abdullah M, Gustavo EAPAB, Brandon Westover M, Qiang Z, Jesin Z, Eamonn JK (2013) Addressing Big Data Time Series: Mining Trillions of Time Series Subsequences Under Dynamic Time Warping. ACM Trans Knowl Discov Data 7(3):1–31
Thirey B and Hickman R (2015) Distribution of euclidean distances between randomly distributed gaussian points in n-space. arXiv:1508.02238.
Vasheghani-Farahani I, Alex C, Russell EK, Michael GK, and Brad K (2019) "Time Series Anomaly Detection from a Markov Chain Perspective." In 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1000–1007. IEEE
Wang JT-L, Wang X, Lin K-I, et al (1999) Evaluating a class of distance-mapping algorithms for data mining and clustering. In: Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’99. ACM Press, San Diego, California, United States, pp 307–311
Wei, Li, Keogh, Eamonn, Xi, Xiaopeng. SAXually Explicit Images: Finding Unusual Shapes. ICDM 2006: 711–720
Wikipedia (2022) "January 2017 Melbourne car attack.", last modified July 3, 2022, 1:21, https://en.wikipedia.org/wiki/January_2017_Melbourne_car_attack.
Wu, R and Keogh Eamonn (2021) Current Time Series Anomaly Detection Benchmarks are Flawed and are Creating the Illusion of Progress. CoRR abs/2009.13807 (2020)
Yankov D, Keogh E, Rebbapragada U (2008) Disk aware discord discovery: finding unusual time series in terabyte sized datasets. Knowl Inf Syst 17(2S):241–262
Yeh C-CM, Yan Zhu LU, Nurjahan B, Yifei D, Hoang AD, Diego FS, Abdullah M, and Eamonn K (2016) "Matrix profile I: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets." In 2016 IEEE 16th international conference on data mining (ICDM), pp. 1317–1322. Ieee.
Zhang V (2019), “A tour of AI technologies in time series prediction.” Society of Actuaries.
Funding
Funding was provided by National Science Foundation (Grant no. 1631776)
Author information
Authors and Affiliations
Corresponding author
Additional information
Responsible Editor: Panagiotis Papapetrou.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Some benchmark datasets are trivial
In the main text we noted that some fraction of the benchmark data yield to simple algorithms from the 1950s (Page 1957). Here we demonstrate that claim. This is important because it confounds any comparison of algorithms. For example, suppose we find that Olympic powerlifter Long Qingquan can lift 1, 2, 3 and 300 kg, and that the current author can lift 1, 2 and 3 kg. It would be foolish to conclude that because they agree on ¾ of the lifting tasks, that they are almost equally strong.
A further simplified version of the sixty-three-year-old algorithm in (Page 1957) is:
flag = zeros(size(T)); %% Code can be run in Matlab |
for i = 4 : length(T)-4 |
if std(T(i + 1:i + 4)) − td(T(i-3:i)) > 1, flag(i) = 1;, end; |
end; |
In Fig. 25 we show the results of running this code on two benchmark datasets that yield to such simple algorithms.
Appendix B : choosing a value for L
A reasonable value for L is about one period, say one heartbeat, one day, one machine cycle etc. The following snippet of Matlab code will return one period.
[autocor,lags] = xcorr(T,'coeff'); |
[~ ,m] = findpeaks(autocor(length(T) + 10:length(T) + 1000),… |
lags(length(T) + 10:length(T) + 1000),'SortStr','descend','NPeaks',1); |
m(isempty(m)) = 1000; |
m = floor(m); |
While this is reasonable heuristic, it is not perfect. For example, an anomaly may show up only at a fraction of a period length, but then get swamped if you consider a full period. This is why the multiscale property of MERLIN++ is so useful, it frees the user from needing to be careful in their choice of a parameter.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Nakamura, T., Mercer, R., Imamura, M. et al. MERLIN++: parameter-free discovery of time series anomalies. Data Min Knowl Disc 37, 670–709 (2023). https://doi.org/10.1007/s10618-022-00876-7
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10618-022-00876-7