Abstract
Previous Transformer-based models for multivariate time series forecasting focus mainly on learning temporal dependencies and neglect the associations between variables. Recent methods that apply attention over spatial (variate) tokens before or after temporal learning do yield improvements. However, this all-to-all association homogenizes variables of differing complexity and cannot learn accurate spatiotemporal dependencies for each of them; it also incurs a steep increase in computational cost, especially when the number of variables is large. We propose a Variable class (Vcls) token that improves temporal Transformers. The proposed TLCC-SC module produces accurate and inclusive variable categories, from which the Vcls token is generated. By attending to the Vcls token, the temporal tokens in the Transformer acquire strongly correlated cross-spatiotemporal dependencies from other variables within the same class. Our method yields general improvements for temporal Transformers and achieves performance consistent with state-of-the-art models on challenging real-world datasets. The code is available at: https://github.com/Joeland4/Vcls.
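The abstract implies a two-stage design: variables are first grouped into classes, and a class token is then prepended to each variable's temporal token sequence so that temporal attention can pick up information shared within the class. The paper's exact formulations are not reproduced here, so the following is a minimal sketch under stated assumptions: TLCC-SC is read as time-lagged cross-correlation followed by spectral clustering, and all names, shapes, and hyperparameters (tlcc_affinity, VclsTemporalEncoder, max_lag, the learnable per-class token) are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import SpectralClustering


def tlcc_affinity(x: np.ndarray, max_lag: int = 8) -> np.ndarray:
    """Affinity between variables: max |Pearson corr.| over time lags.

    x: (T, N) array with T time steps and N variables.
    """
    T, N = x.shape
    x = (x - x.mean(0)) / (x.std(0) + 1e-8)
    A = np.eye(N)
    for i in range(N):
        for j in range(i + 1, N):
            best = 0.0
            for lag in range(-max_lag, max_lag + 1):
                a = x[max(lag, 0): T + min(lag, 0), i]
                b = x[max(-lag, 0): T + min(-lag, 0), j]
                best = max(best, abs(float(np.corrcoef(a, b)[0, 1])))
            A[i, j] = A[j, i] = best
    return A


def tlcc_sc(x: np.ndarray, n_classes: int, max_lag: int = 8) -> np.ndarray:
    """TLCC-SC as assumed here: spectral clustering on the TLCC affinity."""
    A = tlcc_affinity(x, max_lag)
    return SpectralClustering(n_classes, affinity="precomputed").fit_predict(A)


class VclsTemporalEncoder(nn.Module):
    """Hypothetical encoder: a learnable Vcls token per variable class is
    prepended to each variable's temporal tokens, so temporal attention can
    aggregate dependencies shared by variables of the same class."""

    def __init__(self, d_model: int, n_classes: int, n_heads: int = 4):
        super().__init__()
        self.vcls = nn.Parameter(torch.randn(n_classes, 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, tokens: torch.Tensor, labels) -> torch.Tensor:
        # tokens: (B, N, L, D) temporal tokens; labels: class id per variable.
        B, N, L, D = tokens.shape
        out = []
        for v in range(N):
            cls = self.vcls[int(labels[v])].expand(B, 1, D)
            seq = torch.cat([cls, tokens[:, v]], dim=1)  # (B, L + 1, D)
            out.append(self.encoder(seq)[:, 1:])         # keep temporal tokens
        return torch.stack(out, dim=1)                   # (B, N, L, D)
```

A plausible usage, under the same assumptions, is labels = tlcc_sc(train_series, n_classes=4) computed once on the raw training series, then VclsTemporalEncoder(d_model, n_classes=4)(tokens, labels) inside any patch-based temporal Transformer. A shared learnable per-class token is one reading of "generation of the Vcls token"; the paper may instead derive it from the embeddings of the variables in each class.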
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.