Memetic Genetic Algorithms for Time Series Compression by Piecewise Linear Approximation | SpringerLink
Skip to main content

Memetic Genetic Algorithms for Time Series Compression by Piecewise Linear Approximation

  • Conference paper
  • First Online:
Neural Information Processing (ICONIP 2020)

Abstract

Time series are sequences of data indexed by time. Such data are collected in various domains, often in massive amounts, such that storing them proves challenging. Thus, time series are commonly stored in a compressed format. An important compression approach is piecewise linear approximation (PLA), which only keeps a small set of time points and interpolates the remainder linearly. Picking a subset of time points such that the PLA minimizes the mean squared error to the original time series is a challenging task, naturally lending itself to heuristics. We propose the piecewise linear approximation genetic algorithm (PLA-GA) for compressing time series by PLA. The PLA-GA is a memetic \((\mu + \lambda )\) GA that makes use of two distinct operators tailored to time series compression. First, we add special individuals to the initial population that are derived using established PLA heuristics. Second, we propose a novel local search operator that greedily improves a compressed time series. We compare the PLA-GA empirically with existing evolutionary approaches and with a deterministic PLA algorithm, known as Bellman’s algorithm, that is optimal for the restricted setting of sampling. In both cases, the PLA-GA approximates the original time series better and quicker. Further, it drastically outperforms Bellman’s algorithm with increasing instance size with respect to run time until finding a solution of equal or better quality – we observe speed-up factors between 7 and 100 for instances of 90, 000 to 100, 000 data points.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 11439
Price includes VAT (Japan)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 14299
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    https://github.com/arthurz0/ga-for-time-series-compression-by-pla/.

  2. 2.

    Bellman’s algorithm was modified to calculate the segment errors incrementally on the fly. This reduced the memory complexity from \(\mathcal {O}(n^2)\) to \(\mathcal {O}(mn)\), allowing us to run larger experiments. This modification does not affect the asymptotic run time.

References

  1. Agrawal, R., Faloutsos, C., Swami, A.: Efficient similarity search in sequence databases. In: Proceedings of FODO 1993, pp. 69–84 (1993)

    Google Scholar 

  2. Aruoba, S.B., Fernández-Villaverde, J.: A comparison of programming languages in economics. Working Paper 20263, National Bureau of Economic Research (2014)

    Google Scholar 

  3. Baragona, R., Battaglia, F., Poli, I.: Evolutionary Statistical Procedures. An evolutionary computation approach to statistical procedures designs and applications. Springer (2011).https://doi.org/10.1007/978-3-6SD42-16218-3

  4. Bellman, R., Kotkin, B.: On the approximation of curves by line segments using dynamic programming. RAND Corporation, II. Technical report (1962)

    Google Scholar 

  5. Cucina, D., Rizzo, M., Ursu, E.: Multiple changepoint detection for periodic autoregressive models with an application to river flow analysis. Stoch. Environ. Res. Risk Assess. 33(4–6), 1137–1157 (2019)

    Article  Google Scholar 

  6. Doerr, B., Doerr, C., Kötzing, T.: Static and self-adjusting mutation strengths for multi-valued decision variables. Algorithmica 80(5), 1732–1768 (2018)

    Article  MathSciNet  Google Scholar 

  7. Doerr, B., Fischer, P., Hilbert, A., Witt, C.: Detecting structural breaks in time series via genetic algorithms. Soft. Comput. 21(16), 4707–4720 (2017)

    Article  Google Scholar 

  8. Douglas, D.H., Peucker, T.K.: Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartograph. Int. J. Geograph. Inf. Geovis. 10(2), 112–122 (1973)

    Google Scholar 

  9. Dunning, T., Friedman, E.: Time Series Databases: New Ways to Store and Access Data. O’Reilly and Associates, Sebastopol (2014)

    Google Scholar 

  10. Durán-Rosal, A.M., Gutiérrez, P.A., Poyato, Á.C., Hervás-Martínez, C.: A hybrid dynamic exploitation barebones particle swarm optimisation algorithm for time series segmentation. Neurocomputing 353, 45–55 (2019)

    Article  Google Scholar 

  11. Durán-Rosal, A.M., Gutiérrez, P.A., Salcedo-Sanz, S., Hervás-Martínez, C.: Dynamical memetization in coral reef optimization algorithms for optimal time series approximation. Progress in AI 8(2), 253–262 (2019)

    Google Scholar 

  12. Durán-Rosal, A.M., Peáa, P.A.G., Martínez-Estudillo, F.J., Hervás-Martínez, C.: Time series representation by a novel hybrid segmentation algorithm. In: Proceedings of HAIS 2016, vol. 9648, pp. 163–173 (2016)

    Google Scholar 

  13. Eiben, A.E., Smith, J.E.: Introduction to Evolutionary Computing. NCS. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-44874-8

    Book  MATH  Google Scholar 

  14. Elmeleegy, H., Elmagarmid, A.K., Cecchet, E., Aref, W.G., Zwaenepoel, W.: Online piecewise linear approximation of numerical streams with precision guarantees. PVLDB 2(1), 145–156 (2009)

    Google Scholar 

  15. Esling, P., Agon, C.: Time-series data mining. ACM Comput. Surveys (CSUR) 45(1), 1–34 (2012)

    Article  Google Scholar 

  16. Fan, J., Yao, Q.: Nonlinear Time Series. Nonparametric and parametric methods. Springer (2008). https://doi.org/10.1007/978-0-387-69395-8

  17. Fu, T.C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164–181 (2011)

    Article  Google Scholar 

  18. Hollmig, G., et al.: An evaluation of combinations of lossy compression and change-detection approaches for time-series data. Inf. Syst. 65, 65–77 (2017)

    Article  Google Scholar 

  19. Hung, N.Q.V., Jeung, H., Aberer, K.: An evaluation of model-based approaches to sensor data compression. IEEE Trans. Knowl. Data Eng. 25(11), 2434–2447 (2013)

    Article  Google Scholar 

  20. Hurvitz, P.M.: GPS and accelerometer time stamps: proper data handling and avoiding pitfalls. In: Proceedings of ACM SIGSPATIAL 2015, pp. 94–100 (2015)

    Google Scholar 

  21. Keogh, E., Chu, S., Hart, D., Pazzani, M.: Segmenting time series: a survey and novel approach. In: Data Mining in Time Series Databases, pp. 1–21. World Scientific (2004)

    Google Scholar 

  22. Last, M., Horst, B., Abraham, K.: Data mining in time series databases. World Scientific (2004)

    Google Scholar 

  23. Lin, J.W., Liao, S.W., Leu, F.Y.: Sensor data compression using bounded error piecewise linear approximation with resolution reduction. Energies 12(13), 2523 (2019)

    Google Scholar 

  24. Lovrić, M., Milanović, M., Stamenković, M.: Algoritmic methods for segmentation of time series: an overview. JCBI 1(1), 31–53 (2014)

    Google Scholar 

  25. Marmarelis, M.G.: Efficient and robust polylinear analysis of noisy time series. arXiv preprint, CoRR abs/1704.02577 (2017)

    Google Scholar 

  26. Ramer, U.: An iterative procedure for the polygonal approximation of plane curves. Comput. Graph. Image Process. 1(3), 244–256 (1972)

    Article  Google Scholar 

  27. Ratanamahatana, C.A., Lin, J., Gunopulos, D., Keogh, E., Vlachos, M., Das, G.: Mining time series data. In: Data Mining and Knowledge Discovery Handbook, pp. 1069–1103. Springer, Boston (2005). https://doi.org/10.1007/0-387-25465-X_51

  28. Reiss, A., Stricker, D.: Creating and benchmarking a new dataset for physical activity monitoring. In: Proceedings of PETRA 2012, pp. 1–8 (2012)

    Google Scholar 

  29. Reiss, A., Stricker, D.: Introducing a new benchmarked dataset for activity monitoring. In: Proceedings of ISWC 2012, pp. 108–109 (2012)

    Google Scholar 

  30. Sinaie, S., Nguyen, V.P., Nguyen, C.T., Bordas, S.: Programming the material point method in Julia. Adv. Eng. Softw. 105, 17–29 (2017)

    Article  Google Scholar 

  31. Struzik, Z.R., Siebes, A.: Wavelet transform in similarity paradigm. In: Proceedings of PAKDD, pp. 295–309 (1998)

    Google Scholar 

  32. The UCR time series classification archive. http://www.springer.com/lncs. Accessed 27 June 2020

  33. UCI machine learning repository. http://archive.ics.uci.edu/ml. Accessed 27 June 2020

  34. Visvalingam, M., Whyatt, J.D.: Line generalisation by repeated elimination of points. Cartograph. J. 30(1), 46–51 (1993)

    Article  Google Scholar 

  35. Welch, W.: Algorithmic complexity: three NP-hard problems in computational statistics. J. Stat. Comput. Simul. 15, 17–25 (1982)

    Google Scholar 

  36. Zhao, H., Dong, Z., Li, T., Wang, X., Pang, C.: Segmenting time series with connected lines under maximum error bound. Inf. Sci. 345, 1–8 (2016)

    Article  Google Scholar 

  37. Zordan, D., Martínez, B., Vilajosana, I., Rossi, M.: On the performance of Lossy compression schemes for energy constrained sensor networking. ACM Trans. Sens. Netw. 11(1), 15:1–15:34 (2014)

    Google Scholar 

Download references

Acknowledgments

We thank Durán-Rosal et al.  [10,11,12] very much for immediately providing us with the source code of their evolutionary algorithms upon request.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Martin S. Krejca .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Friedrich, T., Krejca, M.S., Lagodzinski, J.A.G., Rizzo, M., Zahn, A. (2020). Memetic Genetic Algorithms for Time Series Compression by Piecewise Linear Approximation. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science(), vol 12534. Springer, Cham. https://doi.org/10.1007/978-3-030-63836-8_49

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-63836-8_49

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63835-1

  • Online ISBN: 978-3-030-63836-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics