Memetic Genetic Algorithms for Time Series Compression by Piecewise Linear Approximation

Friedrich, Tobias; Krejca, Martin S.; Lagodzinski, J. A. Gregor; Rizzo, Manuel; Zahn, Arthur

doi:10.1007/978-3-030-63836-8_49

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 12534))

Included in the following conference series:

International Conference on Neural Information Processing

2231 Accesses

Abstract

Time series are sequences of data indexed by time. Such data are collected in various domains, often in massive amounts, such that storing them proves challenging. Thus, time series are commonly stored in a compressed format. An important compression approach is piecewise linear approximation (PLA), which only keeps a small set of time points and interpolates the remainder linearly. Picking a subset of time points such that the PLA minimizes the mean squared error to the original time series is a challenging task, naturally lending itself to heuristics. We propose the piecewise linear approximation genetic algorithm (PLA-GA) for compressing time series by PLA. The PLA-GA is a memetic \((\mu + \lambda )\) GA that makes use of two distinct operators tailored to time series compression. First, we add special individuals to the initial population that are derived using established PLA heuristics. Second, we propose a novel local search operator that greedily improves a compressed time series. We compare the PLA-GA empirically with existing evolutionary approaches and with a deterministic PLA algorithm, known as Bellman’s algorithm, that is optimal for the restricted setting of sampling. In both cases, the PLA-GA approximates the original time series better and quicker. Further, it drastically outperforms Bellman’s algorithm with increasing instance size with respect to run time until finding a solution of equal or better quality – we observe speed-up factors between 7 and 100 for instances of 90, 000 to 100, 000 data points.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

¥17,985 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: JPY 3498; Price includes VAT (Japan)

eBook: JPY 11439; Price includes VAT (Japan)

Softcover Book: JPY 14299; Price includes VAT (Japan)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Variable-Chromosome-Length Genetic Algorithm for Time Series Discretization

Optimized Word-Size Time Series Representation Method Using a Genetic Algorithm with a Flexible Encoding Scheme

TSXor: A Simple Time Series Compression Algorithm

Notes

1.
https://github.com/arthurz0/ga-for-time-series-compression-by-pla/.
2.
Bellman’s algorithm was modified to calculate the segment errors incrementally on the fly. This reduced the memory complexity from \(\mathcal {O}(n^2)\) to \(\mathcal {O}(mn)\), allowing us to run larger experiments. This modification does not affect the asymptotic run time.

References

Agrawal, R., Faloutsos, C., Swami, A.: Efficient similarity search in sequence databases. In: Proceedings of FODO 1993, pp. 69–84 (1993)
Google Scholar
Aruoba, S.B., Fernández-Villaverde, J.: A comparison of programming languages in economics. Working Paper 20263, National Bureau of Economic Research (2014)
Google Scholar
Baragona, R., Battaglia, F., Poli, I.: Evolutionary Statistical Procedures. An evolutionary computation approach to statistical procedures designs and applications. Springer (2011).https://doi.org/10.1007/978-3-6SD42-16218-3
Bellman, R., Kotkin, B.: On the approximation of curves by line segments using dynamic programming. RAND Corporation, II. Technical report (1962)
Google Scholar
Cucina, D., Rizzo, M., Ursu, E.: Multiple changepoint detection for periodic autoregressive models with an application to river flow analysis. Stoch. Environ. Res. Risk Assess. 33(4–6), 1137–1157 (2019)
Article Google Scholar
Doerr, B., Doerr, C., Kötzing, T.: Static and self-adjusting mutation strengths for multi-valued decision variables. Algorithmica 80(5), 1732–1768 (2018)
Article MathSciNet Google Scholar
Doerr, B., Fischer, P., Hilbert, A., Witt, C.: Detecting structural breaks in time series via genetic algorithms. Soft. Comput. 21(16), 4707–4720 (2017)
Article Google Scholar
Douglas, D.H., Peucker, T.K.: Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartograph. Int. J. Geograph. Inf. Geovis. 10(2), 112–122 (1973)
Google Scholar
Dunning, T., Friedman, E.: Time Series Databases: New Ways to Store and Access Data. O’Reilly and Associates, Sebastopol (2014)
Google Scholar
Durán-Rosal, A.M., Gutiérrez, P.A., Poyato, Á.C., Hervás-Martínez, C.: A hybrid dynamic exploitation barebones particle swarm optimisation algorithm for time series segmentation. Neurocomputing 353, 45–55 (2019)
Article Google Scholar
Durán-Rosal, A.M., Gutiérrez, P.A., Salcedo-Sanz, S., Hervás-Martínez, C.: Dynamical memetization in coral reef optimization algorithms for optimal time series approximation. Progress in AI 8(2), 253–262 (2019)
Google Scholar
Durán-Rosal, A.M., Peáa, P.A.G., Martínez-Estudillo, F.J., Hervás-Martínez, C.: Time series representation by a novel hybrid segmentation algorithm. In: Proceedings of HAIS 2016, vol. 9648, pp. 163–173 (2016)
Google Scholar
Eiben, A.E., Smith, J.E.: Introduction to Evolutionary Computing. NCS. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-44874-8
Book MATH Google Scholar
Elmeleegy, H., Elmagarmid, A.K., Cecchet, E., Aref, W.G., Zwaenepoel, W.: Online piecewise linear approximation of numerical streams with precision guarantees. PVLDB 2(1), 145–156 (2009)
Google Scholar
Esling, P., Agon, C.: Time-series data mining. ACM Comput. Surveys (CSUR) 45(1), 1–34 (2012)
Article Google Scholar
Fan, J., Yao, Q.: Nonlinear Time Series. Nonparametric and parametric methods. Springer (2008). https://doi.org/10.1007/978-0-387-69395-8
Fu, T.C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164–181 (2011)
Article Google Scholar
Hollmig, G., et al.: An evaluation of combinations of lossy compression and change-detection approaches for time-series data. Inf. Syst. 65, 65–77 (2017)
Article Google Scholar
Hung, N.Q.V., Jeung, H., Aberer, K.: An evaluation of model-based approaches to sensor data compression. IEEE Trans. Knowl. Data Eng. 25(11), 2434–2447 (2013)
Article Google Scholar
Hurvitz, P.M.: GPS and accelerometer time stamps: proper data handling and avoiding pitfalls. In: Proceedings of ACM SIGSPATIAL 2015, pp. 94–100 (2015)
Google Scholar
Keogh, E., Chu, S., Hart, D., Pazzani, M.: Segmenting time series: a survey and novel approach. In: Data Mining in Time Series Databases, pp. 1–21. World Scientific (2004)
Google Scholar
Last, M., Horst, B., Abraham, K.: Data mining in time series databases. World Scientific (2004)
Google Scholar
Lin, J.W., Liao, S.W., Leu, F.Y.: Sensor data compression using bounded error piecewise linear approximation with resolution reduction. Energies 12(13), 2523 (2019)
Google Scholar
Lovrić, M., Milanović, M., Stamenković, M.: Algoritmic methods for segmentation of time series: an overview. JCBI 1(1), 31–53 (2014)
Google Scholar
Marmarelis, M.G.: Efficient and robust polylinear analysis of noisy time series. arXiv preprint, CoRR abs/1704.02577 (2017)
Google Scholar
Ramer, U.: An iterative procedure for the polygonal approximation of plane curves. Comput. Graph. Image Process. 1(3), 244–256 (1972)
Article Google Scholar
Ratanamahatana, C.A., Lin, J., Gunopulos, D., Keogh, E., Vlachos, M., Das, G.: Mining time series data. In: Data Mining and Knowledge Discovery Handbook, pp. 1069–1103. Springer, Boston (2005). https://doi.org/10.1007/0-387-25465-X_51
Reiss, A., Stricker, D.: Creating and benchmarking a new dataset for physical activity monitoring. In: Proceedings of PETRA 2012, pp. 1–8 (2012)
Google Scholar
Reiss, A., Stricker, D.: Introducing a new benchmarked dataset for activity monitoring. In: Proceedings of ISWC 2012, pp. 108–109 (2012)
Google Scholar
Sinaie, S., Nguyen, V.P., Nguyen, C.T., Bordas, S.: Programming the material point method in Julia. Adv. Eng. Softw. 105, 17–29 (2017)
Article Google Scholar
Struzik, Z.R., Siebes, A.: Wavelet transform in similarity paradigm. In: Proceedings of PAKDD, pp. 295–309 (1998)
Google Scholar
The UCR time series classification archive. http://www.springer.com/lncs. Accessed 27 June 2020
UCI machine learning repository. http://archive.ics.uci.edu/ml. Accessed 27 June 2020
Visvalingam, M., Whyatt, J.D.: Line generalisation by repeated elimination of points. Cartograph. J. 30(1), 46–51 (1993)
Article Google Scholar
Welch, W.: Algorithmic complexity: three NP-hard problems in computational statistics. J. Stat. Comput. Simul. 15, 17–25 (1982)
Google Scholar
Zhao, H., Dong, Z., Li, T., Wang, X., Pang, C.: Segmenting time series with connected lines under maximum error bound. Inf. Sci. 345, 1–8 (2016)
Article Google Scholar
Zordan, D., Martínez, B., Vilajosana, I., Rossi, M.: On the performance of Lossy compression schemes for energy constrained sensor networking. ACM Trans. Sens. Netw. 11(1), 15:1–15:34 (2014)
Google Scholar

Download references

Acknowledgments

We thank Durán-Rosal et al. [10,11,12] very much for immediately providing us with the source code of their evolutionary algorithms upon request.

Author information

Authors and Affiliations

Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
Tobias Friedrich, Martin S. Krejca, J. A. Gregor Lagodzinski, Manuel Rizzo & Arthur Zahn

Authors

Tobias Friedrich
View author publications
You can also search for this author in PubMed Google Scholar
Martin S. Krejca
View author publications
You can also search for this author in PubMed Google Scholar
J. A. Gregor Lagodzinski
View author publications
You can also search for this author in PubMed Google Scholar
Manuel Rizzo
View author publications
You can also search for this author in PubMed Google Scholar
Arthur Zahn
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Martin S. Krejca .

Editor information

Editors and Affiliations

Department of AI, Ping An Life, Shenzhen, China
Haiqin Yang
Faculty of Information Technology, King Mongkut's Institute of Technology Ladkrabang, Bangkok, Thailand
Kitsuchart Pasupa
City University of Hong Kong, Kowloon, Hong Kong
Andrew Chi-Sing Leung
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, Hong Kong
James T. Kwok
School of Information Technology, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
Jonathan H. Chan
The Chinese University of Hong Kong, New Territories, Hong Kong
Irwin King

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Friedrich, T., Krejca, M.S., Lagodzinski, J.A.G., Rizzo, M., Zahn, A. (2020). Memetic Genetic Algorithms for Time Series Compression by Piecewise Linear Approximation. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science(), vol 12534. Springer, Cham. https://doi.org/10.1007/978-3-030-63836-8_49

Download citation

DOI: https://doi.org/10.1007/978-3-030-63836-8_49
Published: 19 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63835-1
Online ISBN: 978-3-030-63836-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Memetic Genetic Algorithms for Time Series Compression by Piecewise Linear Approximation

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

Variable-Chromosome-Length Genetic Algorithm for Time Series Discretization

Optimized Word-Size Time Series Representation Method Using a Genetic Algorithm with a Flexible Encoding Scheme

TSXor: A Simple Time Series Compression Algorithm

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

Memetic Genetic Algorithms for Time Series Compression by Piecewise Linear Approximation

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

Variable-Chromosome-Length Genetic Algorithm for Time Series Discretization

Optimized Word-Size Time Series Representation Method Using a Genetic Algorithm with a Flexible Encoding Scheme

TSXor: A Simple Time Series Compression Algorithm

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation