Time series extrinsic regression | Data Mining and Knowledge Discovery

Time series extrinsic regression

Predicting numeric values from time series data

Data Mining and Knowledge Discovery

Abstract

This paper studies time series extrinsic regression (TSER): a regression task whose aim is to learn the relationship between a time series and a continuous scalar variable. The task is closely related to time series classification (TSC), which aims to learn the relationship between a time series and a categorical class label. TSER generalizes time series forecasting by relaxing the requirement that the predicted value be a future value of the input series or depend primarily on more recent values. In this paper, we motivate and study this task, and benchmark existing solutions and adaptations of TSC algorithms on a novel archive of 19 TSER datasets that we have assembled. Our results show that the state-of-the-art TSC algorithm Rocket, when adapted for regression, achieves the highest overall accuracy compared to adaptations of other TSC algorithms and state-of-the-art machine learning (ML) algorithms such as XGBoost, Random Forest and Support Vector Regression. More importantly, we show that much research is needed in this field to improve the accuracy of ML models. We also find evidence that further research has excellent prospects of improving upon these straightforward baselines.
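To make the task concrete, the following is a minimal sketch of a TSER pipeline in the spirit of Rocket adapted for regression: random convolutional kernels turn each series into a feature vector, and a linear ridge model maps those features to a scalar. It is not the authors' implementation. The function names (`make_kernels`, `transform`, `fit_ridge`, `predict_ridge`), the synthetic sine-wave data, the kernel count, the absence of padding, and the closed-form ridge head are all assumptions made for illustration.

```python
import numpy as np


def make_kernels(n_kernels, series_length, rng):
    """Draw random kernels: (weights, bias, dilation). Hypothetical helper."""
    kernels = []
    for _ in range(n_kernels):
        length = int(rng.choice([7, 9, 11]))
        weights = rng.normal(size=length)
        weights -= weights.mean()  # mean-centre the weights
        bias = rng.uniform(-1.0, 1.0)
        # Sample dilation on a log scale so the dilated kernel fits the series.
        max_exp = np.log2((series_length - 1) / (length - 1))
        dilation = int(2 ** rng.uniform(0.0, max_exp))
        kernels.append((weights, bias, dilation))
    return kernels


def transform(X, kernels):
    """Map each series to two features per kernel: max and proportion of positive values."""
    n, L = X.shape
    feats = np.empty((n, 2 * len(kernels)))
    for k, (w, b, d) in enumerate(kernels):
        span = (len(w) - 1) * d
        out_len = L - span  # "valid" convolution, no padding (a simplification)
        out = np.full((n, out_len), b)
        for j, wj in enumerate(w):
            out += wj * X[:, j * d : j * d + out_len]
        feats[:, 2 * k] = out.max(axis=1)
        feats[:, 2 * k + 1] = (out > 0).mean(axis=1)
    return feats


def fit_ridge(F, y, alpha=1.0):
    """Closed-form ridge regression on standardised features."""
    mu, sigma = F.mean(axis=0), F.std(axis=0) + 1e-8
    Z = (F - mu) / sigma
    y_mean = y.mean()
    A = Z.T @ Z + alpha * np.eye(Z.shape[1])
    coef = np.linalg.solve(A, Z.T @ (y - y_mean))
    return mu, sigma, coef, y_mean


def predict_ridge(model, F):
    mu, sigma, coef, y_mean = model
    return ((F - mu) / sigma) @ coef + y_mean


# Synthetic TSER problem (assumed for illustration): the scalar target is the
# amplitude of a noisy sine wave, which is not a future value of the series.
rng = np.random.default_rng(0)
n, L = 200, 100
t = np.arange(L)
amplitude = rng.uniform(0.5, 2.0, size=n)
phase = rng.uniform(0.0, 2 * np.pi, size=n)
X = amplitude[:, None] * np.sin(2 * np.pi * 5 * t / L + phase[:, None])
X += rng.normal(scale=0.1, size=(n, L))
y = amplitude

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

kernels = make_kernels(100, L, rng)
model = fit_ridge(transform(X_train, kernels), y_train)
pred = predict_ridge(model, transform(X_test, kernels))

rmse = np.sqrt(np.mean((pred - y_test) ** 2))
baseline_rmse = np.sqrt(np.mean((y_train.mean() - y_test) ** 2))
print(f"RMSE: {rmse:.3f}  (predict-the-mean baseline: {baseline_rmse:.3f})")
```

The target here is a property of the whole series rather than its next value, which is the distinction the abstract draws between TSER and forecasting. Comparing against a predict-the-mean baseline is only a sanity check and says nothing about the benchmark results reported in the paper.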

Notes

  1. https://www.kaggle.com/vikrishnan/boston-house-prices.

  2. https://github.com/robjhyndman/tsfeatures.

  3. https://xgboost.readthedocs.io/en/latest/python/python_intro.html.

  4. https://fda.readthedocs.io/en/latest/.

  5. https://github.com/hfawaz/dl-4-tsc.

  6. https://github.com/hfawaz/InceptionTime.

  7. https://github.com/angus924/rocket.

References

  • Bagnall A, Lines J, Hills J, Bostrom A (2015) Time-series classification with COTE: the collective of transformation-based ensembles. IEEE Trans Knowl Data Eng 27(9):2522–2535

  • Bagnall A, Lines J, Bostrom A, Large J, Keogh E (2017) The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Min Knowl Discov 31(3):606–660

  • Baydogan MG, Runger G (2015) Learning a symbolic representation for multivariate time series classification. Data Min Knowl Discov 29(2):400–422

  • Box GE, Jenkins GM (1970) Time series analysis: forecasting and control. Tech. rep., Wisconsin University, Dept of Statistics

  • Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  • Chatfield C (1978) The Holt-Winters forecasting procedure. J R Stat Soc Ser C (Appl Stat) 27(3):264–279

  • Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 785–794

  • Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297

  • Dau HA, Bagnall A, Kamgar K, Yeh CCM, Zhu Y, Gharghabi S, Ratanamahatana CA, Keogh E (2019) The UCR time series archive. IEEE/CAA J Autom Sin 6(6):1293–1305

  • De Vito S, Massera E, Piga M, Martinotto L, Di Francia G (2008) On field calibration of an electronic nose for benzene estimation in an urban pollution monitoring scenario. Sens Actuators B Chem 129(2):750–757

  • Dempster A, Petitjean F, Webb GI (2020) ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels. Data Min Knowl Discov 34(5):1454–1495

  • Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30

  • Deng H, Runger G, Tuv E, Vladimir M (2013) A time series forest for classification and feature extraction. Inf Sci 239:142–153

  • Drucker H, Burges CJ, Kaufman L, Smola AJ, Vapnik V (1997) Support vector regression machines. In: Advances in neural information processing systems, pp 155–161

  • Dua D, Graff C (2017) UCI machine learning repository. http://archive.ics.uci.edu/ml

  • Fawaz HI, Forestier G, Weber J, Idoumghar L, Muller PA (2018) Transfer learning for time series classification. In: Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), pp 1367–1376

  • Fawaz HI, Forestier G, Weber J, Idoumghar L, Muller PA (2019) Deep learning for time series classification: a review. Data Min Knowl Discov 33(4):917–963

  • Fawaz HI, Lucas B, Forestier G, Pelletier C, Schmidt DF, Weber J, Webb GI, Idoumghar L, Muller PA, Petitjean F (2020) InceptionTime: finding AlexNet for time series classification. Data Min Knowl Discov 34(6):1936–1962

  • Friedman M (1940) A comparison of alternative tests of significance for the problem of m rankings. Ann Math Stat 11(1):86–92

  • Fulcher BD, Little MA, Jones NS (2013) Highly comparative time-series analysis: the empirical structure of time series and their methods. J R Soc Interface 10(83):20130048. https://doi.org/10.1098/rsif.2013.0048

  • Gardner ES Jr (1985) Exponential smoothing: the state of the art. J Forecast 4(1):1–28

  • Goldsmith J, Scheipl F (2014) Estimator selection and combination in scalar-on-function regression. Comput Stat Data Anal 70:362–372

  • Grabocka J, Schilling N, Wistuba M, Schmidt-Thieme L (2014) Learning time-series shapelets. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 392–401

  • Hyndman R (2018) A brief history of time series forecasting competitions

  • Hyndman R, Koehler AB, Ord JK, Snyder RD (2008) Forecasting with exponential smoothing: the state space approach. Springer, Berlin

  • Kang Y, Hyndman RJ, Smith-Miles K (2017) Visualising forecasting algorithm performance using time series instance spaces. Int J Forecast 33(2):345–358. https://doi.org/10.1016/j.ijforecast.2016.09.004

  • Karlen W, Turner M, Cooke E, Dumont G, Ansermino JM (2010) CapnoBase: signal database and tools to collect, share and annotate respiratory signals. In: Annual meeting of the Society for Technology in Anesthesia (STA), West Palm Beach, p 25

  • Lin J, Khade R, Li Y (2012) Rotation-invariant similarity in time series using bag-of-patterns representation. J Intell Inf Syst 39(2):287–315

  • Lines J, Bagnall A (2015) Time series classification with ensembles of elastic distance measures. Data Min Knowl Discov 29(3):565–592

  • Lines J, Davis LM, Hills J, Bagnall A (2012) A shapelet transform for time series classification. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, pp 289–297

  • Lines J, Taylor S, Bagnall A (2016) HIVE-COTE: the hierarchical vote collective of transformation-based ensembles for time series classification. In: Proceedings of the 16th IEEE International Conference on Data Mining (ICDM), pp 1041–1046

  • Lubba CH, Sethi SS, Knaute P, Schultz SR, Fulcher BD, Jones NS (2019) catch22: canonical time-series characteristics. Data Min Knowl Discov 33(6):1821–1852. https://doi.org/10.1007/s10618-019-00647-x

  • Lucas B, Shifaz A, Pelletier C, O’Neill L, Zaidi N, Goethals B, Petitjean F, Webb GI (2019) Proximity forest: an effective and scalable distance-based classifier for time series. Data Min Knowl Discov 33(3):607–635

  • Makridakis S, Hibon M (2000) The M3-competition: results, conclusions and implications. Int J Forecast 16(4):451–476

  • Makridakis S, Andersen A, Carbone R, Fildes R, Hibon M, Lewandowski R, Newton J, Parzen E, Winkler R (1982) The accuracy of extrapolation (time series) methods: results of a forecasting competition. J Forecast 1(2):111–153

  • Makridakis S, Spiliotis E, Assimakopoulos V (2018) The M4 competition: results, findings, conclusion and way forward. Int J Forecast 34(4):802–808

  • Makridakis S, Spiliotis E, Assimakopoulos V (2020) The M4 competition: 100,000 time series and 61 forecasting methods. Int J Forecast 36(1):54–74

  • Meredith DJ, Clifton D, Charlton P, Brooks J, Pugh C, Tarassenko L (2012) Photoplethysmographic derivation of respiratory rate: a review of relevant physiology. J Med Eng Technol 36(1):1–7

  • Moniz N, Torgo L (2018) Multi-source social feedback of online news feeds. arXiv preprint arXiv:1801.07055

  • Montero-Manso P, Athanasopoulos G, Hyndman RJ, Talagala TS (2020) FFORMA: feature-based forecast model averaging. Int J Forecast 36(1):86–92. https://doi.org/10.1016/j.ijforecast.2019.02.011

  • Mueen A, Keogh E, Young N (2011) Logical-shapelets: an expressive primitive for time series classification. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1154–1162

  • Nielsen D (2016) Tree boosting with XGBoost: why does XGBoost win every machine learning competition? Master’s thesis, NTNU

  • Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830

  • Pelletier C, Webb GI, Petitjean F (2019) Temporal convolutional neural network for the classification of satellite image time series. Remote Sens 11(5):523

  • Pimentel MA, Charlton PH, Clifton DA (2015) Probabilistic estimation of respiratory rate from wearable sensors. In: Wearable electronics sensors. Springer, pp 241–262

  • Pimentel MA, Johnson AE, Charlton PH, Birrenkott D, Watkinson PJ, Tarassenko L, Clifton DA (2016) Toward a robust estimation of respiratory rate from pulse oximeters. IEEE Trans Biomed Eng 64(8):1914–1923

  • Rakthanmanon T, Keogh E (2013) Fast shapelets: a scalable algorithm for discovering time series shapelets. In: Proceedings of the 2013 SIAM International Conference on Data Mining (SDM). SIAM, pp 668–676

  • Reiss PT, Goldsmith J, Shang HL, Ogden RT (2017) Methods for scalar-on-function regression. Int Stat Rev 85(2):228–249

  • Reiss A, Indlekofer I, Schmidt P, Van Laerhoven K (2019) Deep PPG: large-scale heart rate estimation with convolutional neural networks. Sensors 19(14):3079

  • Salehizadeh S, Dao D, Bolkhovsky J, Cho C, Mendelson Y, Chon KH (2016) A novel time-varying spectral filtering algorithm for reconstruction of motion artifact corrupted heart rate signals during intense physical activities using a wearable photoplethysmogram sensor. Sensors 16(1):10

  • Sammut C, Webb GI (2011) Encyclopedia of machine learning. Springer, Berlin

  • Schäck T, Muma M, Zoubir AM (2017) Computationally efficient heart rate estimation during physical exercise using photoplethysmographic signals. In: 2017 25th European Signal Processing Conference (EUSIPCO). IEEE, pp 2478–2481

  • Schäfer P (2015) The BOSS is concerned with time series classification in the presence of noise. Data Min Knowl Discov 29(6):1505–1530

  • Schäfer P, Leser U (2017a) Fast and accurate time series classification with WEASEL. In: Proceedings of the 2017 ACM on conference on information and knowledge management, pp 637–646

  • Schäfer P, Leser U (2017b) Multivariate time series classification with WEASEL+MUSE. arXiv preprint arXiv:1711.11343

  • Segal MR (2004) Machine learning benchmarks and random forest regression. UCSF: Center for Bioinformatics and Molecular Biostatistics

  • Senin P, Malinchik S (2013) SAX-VSM: interpretable time series classification using SAX and vector space model. In: 2013 IEEE 13th international conference on data mining. IEEE, pp 1175–1180

  • Shokoohi-Yekta M, Hu B, Jin H, Wang J, Keogh E (2017) Generalizing DTW to the multi-dimensional case requires an adaptive approach. Data Min Knowl Discov 31(1):1–31

  • Tan CW, Herrmann M, Forestier G, Webb GI, Petitjean F (2018) Efficient search of the best warping window for dynamic time warping. In: Proceedings of the 2018 SIAM International Conference on Data Mining (SDM). SIAM, pp 225–233

  • Tan CW, Bergmeir C, Petitjean F, Webb GI (2020a) Monash University, UEA, UCR time series extrinsic regression archive. arXiv preprint arXiv:2006.10996

  • Tan CW, Petitjean F, Webb GI (2020b) FastEE: fast ensembles of elastic distances for time series classification. Data Min Knowl Discov 34(1):231–272

  • Wang Z, Yan W, Oates T (2017) Time series classification from scratch with deep neural networks: a strong baseline. In: 2017 International Joint Conference on Neural Networks (IJCNN). IEEE, pp 1578–1585

  • Ye L, Keogh E (2009) Time series shapelets: a new primitive for data mining. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge Discovery and Data Mining, pp 947–956

  • Yebra M, Quan X, Riaño D, Larraondo PR, van Dijk AI, Cary GJ (2018) A fuel moisture content and flammability monitoring methodology for continental Australia based on optical remote sensing. Remote Sens Environ 212:260–272

  • Zhang Z (2015) Photoplethysmography-based heart rate monitoring in physical activities via joint sparse spectrum reconstruction. IEEE Trans Biomed Eng 62(8):1902–1910

  • Zhang Z, Pi Z, Liu B (2014) Troika: a general framework for heart rate monitoring using wrist-type photoplethysmographic signals during intensive physical exercise. IEEE Trans Biomed Eng 62(2):522–531

Acknowledgements

This research has been supported by Australian Research Council grant DP210100072; and the Air Force Office of Scientific Research, Asian Office of Aerospace Research and Development (AOARD) under award number FA2386-18-1-4030. The authors appreciate the data donation from all the donors and would like to thank the authors of Fawaz et al. (2019) and Dempster et al. (2020) for providing their source code online.

Author information

Corresponding author

Correspondence to Chang Wei Tan.

Additional information

Responsible editor: Eamonn Keogh.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tan, C.W., Bergmeir, C., Petitjean, F. et al. Time series extrinsic regression. Data Min Knowl Disc 35, 1032–1060 (2021). https://doi.org/10.1007/s10618-021-00745-9

  • DOI: https://doi.org/10.1007/s10618-021-00745-9
