DPHM-Net: de-redundant multi-period hybrid modeling network for long-term series forecasting

Abstract

Deep learning models have been widely applied to long-term series forecasting and have achieved significant success. Incorporating inductive biases such as periodicity to model multi-granularity representations of time series is a commonly employed design approach in forecasting methods. However, existing methods still face challenges related to information redundancy, both during the extraction of the inductive bias and during the learning of multi-granularity features. Redundant information can impede the model from acquiring a comprehensive temporal representation, thereby degrading its predictive performance. To address these issues, we propose a De-Redundant Multi-Period Hybrid Modeling Network (DPHM-Net) that effectively eliminates redundant information both from the series inductive bias extraction mechanism and from the multi-granularity series features learned in time series representation learning. In DPHM-Net, we propose an efficient time series representation learning process based on a period inductive bias and introduce the concept of de-redundancy among multiple time series into the representation learning process for a single time series. Additionally, we design a specialized gated unit to dynamically balance the elimination weights between series features and redundant semantic information. Extensive experiments on real-world datasets demonstrate the advanced performance and high efficiency of our method on long-term forecasting tasks against previous state-of-the-art methods.



Availability of data and materials

The datasets and code used in this paper are available at https://github.com/thuml/TimesNet.

References

  1. Song, H., Rajan, D., Thiagarajan, J., Spanias, A.: Attend and diagnose: Clinical time series analysis using attention models. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018)

  2. Patton, A.: Copula methods for forecasting multivariate time series. Handbook of economic forecasting 2, 899–960 (2013)


  3. Angryk, R.A., Martens, P.C., Aydin, B., Kempton, D., Mahajan, S.S., Basodi, S., Ahmadzadeh, A., Cai, X., Filali Boubrahimi, S., Hamdi, S.M., et al.: Multivariate time series dataset for space weather data analytics. Scientific data 7(1), 227 (2020)


  4. Demirel, Ö.F., Zaim, S., Çalişkan, A., Özuyar, P.: Forecasting natural gas consumption in Istanbul using neural networks and multivariate time series methods. Turkish J. Electr. Eng. Comput. Sci. 20(5), 695–711 (2012)

  5. Wang, H., Peng, J., Huang, F., Wang, J., Chen, J., Xiao, Y.: Micn: Multi-scale local and global context modeling for long-term series forecasting. In: The Eleventh International Conference on Learning Representations (2022)

  6. Zhang, Y., Yan, J.: Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting. In: The Eleventh International Conference on Learning Representations (2022)

  7. Nie, Y., Nguyen, N.H., Sinthong, P., Kalagnanam, J.: A time series is worth 64 words: Long-term forecasting with transformers. arXiv preprint arXiv:2211.14730 (2022)

  8. Zhou, T., Ma, Z., Wen, Q., Wang, X., Sun, L., Jin, R.: Fedformer: Frequency enhanced decomposed transformer for long-term series forecasting. In: International Conference on Machine Learning, pp. 27268–27286 (2022). PMLR

  9. Wu, H., Xu, J., Wang, J., Long, M.: Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Adv. Neural. Inf. Process. Syst. 34, 22419–22430 (2021)


  10. Wu, H., Hu, T., Liu, Y., Zhou, H., Wang, J., Long, M.: Timesnet: Temporal 2d-variation modeling for general time series analysis. arXiv preprint arXiv:2210.02186 (2022)

  11. Borovykh, A., Bohte, S., Oosterlee, C.W.: Conditional time series forecasting with convolutional neural networks. arXiv preprint arXiv:1703.04691 (2017)

  12. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)

  13. Sen, R., Yu, H.-F., Dhillon, I.S.: Think globally, act locally: A deep neural network approach to high-dimensional time series forecasting. Advances in neural information processing systems 32 (2019)

  14. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)

  15. Tenney, I., Das, D., Pavlick, E.: Bert rediscovers the classical nlp pipeline. arXiv preprint arXiv:1905.05950 (2019)

  16. Kang, W.-C., McAuley, J.: Self-attentive sequential recommendation. In: 2018 IEEE International Conference on Data Mining (ICDM), pp. 197–206 (2018). IEEE

  17. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)

  18. Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H., Zhang, W.: Informer: Beyond efficient transformer for long sequence time-series forecasting. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 11106–11115 (2021)

  19. Kitaev, N., Kaiser, Ł., Levskaya, A.: Reformer: The efficient transformer. arXiv preprint arXiv:2001.04451 (2020)

  20. Woo, G., Liu, C., Sahoo, D., Kumar, A., Hoi, S.: Etsformer: Exponential smoothing transformers for time-series forecasting. arXiv preprint arXiv:2202.01381 (2022)

  21. Zeng, A., Chen, M., Zhang, L., Xu, Q.: Are transformers effective for time series forecasting? In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 11121–11128 (2023)

  22. Gardner, E.S., Jr.: Exponential smoothing: The state of the art. J. Forecast. 4(1), 1–28 (1985)


  23. Gardner, E.S., Jr.: Exponential smoothing: The state of the art—Part II. Int. J. Forecast. 22(4), 637–666 (2006)

  24. Bartholomew, D.J.: Time Series Analysis Forecasting and Control. JSTOR (1971)

  25. Box, G., Jenkins, G., Reinsel, G., Ljung, G.: Time Series Analysis: Forecasting and Control. John Wiley & Sons, New Jersey (2016)


  26. Hoerl, A.E., Kennard, R.W.: Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12(1), 55–67 (1970)


  27. Vapnik, V., Golowich, S., Smola, A.: Support vector method for function approximation, regression estimation and signal processing. Advances in neural information processing systems 9 (1996)

  28. Roberts, S., Osborne, M., Ebden, M., Reece, S., Gibson, N., Aigrain, S.: Gaussian processes for time-series modelling. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 371(1984), 20110550 (2013)


  29. Cleveland, R.B., Cleveland, W.S., McRae, J.E., Terpenning, I.: STL: A seasonal-trend decomposition procedure based on loess. J. Off. Stat. 6(1), 3–73 (1990)


  30. Bloomfield, P.: Fourier Analysis of Time Series: An Introduction. John Wiley & Sons (2004)

  31. Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014)

  32. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)


  33. Lai, G., Chang, W.-C., Yang, Y., Liu, H.: Modeling long-and short-term temporal patterns with deep neural networks. In: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 95–104 (2018)

  34. Shih, S.-Y., Sun, F.-K., Lee, H.-y.: Temporal pattern attention for multivariate time series forecasting. Mach. Learn. 108, 1421–1441 (2019)


  35. Ullah, S., Xu, Z., Wang, H., Menzel, S., Sendhoff, B., Bäck, T.: Exploring clinical time series forecasting with meta-features in variational recurrent models. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–9 (2020). IEEE

  36. Liu, Y., Wu, H., Wang, J., Long, M.: Non-stationary transformers: Rethinking the stationarity in time series forecasting. arXiv preprint arXiv:2205.14415 (2022)

  37. Oreshkin, B.N., Carpov, D., Chapados, N., Bengio, Y.: N-beats: Neural basis expansion analysis for interpretable time series forecasting. arXiv preprint arXiv:1905.10437 (2019)

  38. Challu, C., Olivares, K.G., Oreshkin, B.N., Ramirez, F.G., Canseco, M.M., Dubrawski, A.: Nhits: Neural hierarchical interpolation for time series forecasting. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 6989–6997 (2023)

  39. Zhang, T., Zhang, Y., Cao, W., Bian, J., Yi, X., Zheng, S., Li, J.: Less is more: Fast multivariate time series forecasting with light sampling-oriented mlp structures. arXiv preprint arXiv:2207.01186 (2022)

  40. Hou, M., Xu, C., Li, Z., Liu, Y., Liu, W., Chen, E., Bian, J.: Multi-granularity residual learning with confidence estimation for time series prediction. In: Proceedings of the ACM Web Conference 2022, pp. 112–121 (2022)

  41. Johnson, A.E., Pollard, T.J., Shen, L., Lehman, L.-w.H., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Anthony Celi, L., Mark, R.G.: MIMIC-III, a freely accessible critical care database. Scientific Data 3(1), 1–9 (2016)

  42. Harutyunyan, H., Khachatrian, H., Kale, D.C., Ver Steeg, G., Galstyan, A.: Multitask learning and benchmarking with clinical time series data. Scientific Data 6(1), 96 (2019). https://doi.org/10.1038/s41597-019-0103-9


  43. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019)

  44. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)


Funding

This work is supported by the National Natural Science Foundation of China (Grant No. 62376135).

Author information


Contributions

Zheng developed the model, performed the data curation and analysis, and wrote the manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Yuliang Shi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Details of experiments

A.1 Evaluation metrics

We use the metrics commonly used in prediction tasks, i.e., mean squared error (MSE) and mean absolute error (MAE), as performance evaluation metrics, which are defined as follows:

$$MSE=\frac{1}{n} \sum_{i=1}^{n} (y_{i}-\hat{y}_{i})^{2}, \tag{A1}$$

$$MAE=\frac{1}{n} \sum_{i=1}^{n} \left| y_{i}-\hat{y}_{i} \right|, \tag{A2}$$

where n is the number of samples, \(y_{i}\) is the ground truth and \(\hat{y}_{i}\) is the prediction result.
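
For reference, both metrics can be computed in a few lines of NumPy; this is a minimal sketch, and the function names are ours rather than from the paper's code:

```python
import numpy as np

def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean squared error, Eq. (A1)."""
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute error, Eq. (A2)."""
    return float(np.mean(np.abs(y_true - y_pred)))
```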

A.2 Dataset statistics and hyperparameters

All experimental settings are the same as in TimesNet [10]. Table 9 lists detailed information about all the datasets used in the experiments of this paper, together with the hyperparameter settings of our method on each dataset, where MIMIC-III is the decompensation-task data [42] extracted from patient No. 58242 in the database, and \(d_{model}\), \(d_{ff}\), and \(e_{layers}\) denote the dimension of the embedding, the dimension of the hidden representation, and the number of encoder layers, respectively. The random seed is set to 2021.

Table 9 Details and model hyperparameters of datasets
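
As a concrete illustration of these settings, the following minimal PyTorch sketch shows how the fixed seed and the three hyperparameters might be wired up. The numeric values are placeholders for illustration only, not the per-dataset values from Table 9.

```python
import random

import numpy as np
import torch

# The experiments fix the random seed to 2021.
SEED = 2021
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)

# Placeholder hyperparameters (illustrative values only; the actual
# per-dataset settings are listed in Table 9):
config = {
    "d_model": 64,   # dimension of the embedding
    "d_ff": 128,     # dimension of the hidden representation
    "e_layers": 2,   # number of encoder layers
}
```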

A.3 Algorithm of multi-period extraction

The multi-period extraction algorithms of TimesNet and of DPHM-Net are presented below:

Algorithm 3: The code of TimesNet's multi-period extraction algorithm.

Algorithm 4: The code of DPHM-Net's multi-period extraction algorithm.

In Algorithms 3 and 4, x denotes the input series and k the number of periods to be extracted; both algorithms return the extracted period lengths and their corresponding amplitude values. Our improved algorithm additionally guarantees that the extracted period lengths are unique, as sketched below.
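
Since Algorithms 3 and 4 are provided as images, the following minimal PyTorch sketch illustrates the general idea: FFT-based top-k period extraction that skips any frequency whose integer period length duplicates one already selected. The function name and the exact selection rule are our assumptions, not the authors' code.

```python
import torch

def extract_unique_periods(x: torch.Tensor, k: int):
    """Sketch of FFT-based period extraction with de-duplicated lengths.

    x: input series of shape [B, T, C]; k: number of periods to extract.
    Returns k unique period lengths and their per-sample amplitudes.
    """
    T = x.shape[1]
    xf = torch.fft.rfft(x, dim=1)                # spectrum, shape [B, T//2+1, C]
    amp = xf.abs().mean(0).mean(-1)              # mean amplitude per frequency
    amp[0] = 0                                   # ignore the DC component
    order = torch.argsort(amp, descending=True)  # frequencies by salience

    periods, freq_idx = [], []
    for f in order.tolist():
        if f == 0:
            continue
        p = T // f                               # frequency -> period length
        if p not in periods:                     # keep only unique lengths
            periods.append(p)
            freq_idx.append(f)
        if len(periods) == k:
            break

    period_weight = xf.abs().mean(-1)[:, freq_idx]  # amplitudes of chosen freqs
    return periods, period_weight
```

Without such a check, two distinct frequencies can map to the same integer length T // f, so a plain top-k selection over amplitudes may return duplicate periods; skipping duplicates during selection is one simple way to guarantee k distinct lengths.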

Appendix B: Showcases of main results

B.1 Multivariate time series forecasting

As shown in Figures 7, 8, 9, 10, 11, 12, 13 and 14, we plot the forecasting results on the test sets of the multivariate datasets ETTm1 and ETTm2. The results show that the predictions of our methods align more closely with the trend of the ground truth. Moreover, DPHM-Net and DPHM(G)-Net are better at predicting detailed localized changes.

Fig. 7: Prediction-96 cases from the multivariate ETTm1 dataset

Fig. 8: Prediction-192 cases from the multivariate ETTm1 dataset

Fig. 9: Prediction-336 cases from the multivariate ETTm1 dataset

Fig. 10: Prediction-720 cases from the multivariate ETTm1 dataset

Fig. 11: Prediction-96 cases from the multivariate ETTm2 dataset

Fig. 12: Prediction-192 cases from the multivariate ETTm2 dataset

Fig. 13: Prediction-336 cases from the multivariate ETTm2 dataset

Fig. 14: Prediction-720 cases from the multivariate ETTm2 dataset

B.2 Univariate time series forecasting

As shown in Figures 15, 16, 17 and 18, we plot the forecasting results on the test set of the univariate Electricity dataset. The results show that our method achieves better prediction accuracy and can predict random changes in the data with sudden upswings or downswings.

Fig. 15: Prediction-96 cases from the univariate Electricity dataset

Fig. 16: Prediction-192 cases from the univariate Electricity dataset

Fig. 17: Prediction-336 cases from the univariate Electricity dataset

Fig. 18: Prediction-720 cases from the univariate Electricity dataset

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zheng, C., Shi, Y., Lee, W. et al. DPHM-Net: de-redundant multi-period hybrid modeling network for long-term series forecasting. World Wide Web 27, 40 (2024). https://doi.org/10.1007/s11280-024-01281-4
