
Features Fusion Framework for Multimodal Irregular Time-series Events

  • Conference paper
  • First Online:
PRICAI 2022: Trends in Artificial Intelligence (PRICAI 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13629)


Abstract

Data from multiple sources can often be modeled as multimodal time-series events that differ in sampling frequency, data composition, temporal relations and characteristics. Different types of events have complex nonlinear relationships, and the timing of each event is irregular. Neither the classical Recurrent Neural Network (RNN) model nor the current state-of-the-art Transformer model handles these properties well. In this paper, a features fusion framework for multimodal irregular time-series events is proposed based on Long Short-Term Memory (LSTM) networks. First, complex features are extracted according to the irregular patterns of the different events. Second, the nonlinear correlations and complex temporal dependencies among these features are captured and fused into a tensor. Finally, a feature gate is used to control the access frequency of the different tensors. Extensive experiments on the MIMIC-III dataset demonstrate that the proposed framework significantly outperforms existing methods in terms of AUC (the area under the Receiver Operating Characteristic curve) and AP (Average Precision).
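
To make the pipeline concrete, below is a minimal, hypothetical sketch of the three steps in PyTorch; it is not the authors' implementation. Two irregularly sampled modalities are encoded by separate LSTMs, their features are fused into a joint tensor via an outer product, and a learned feature gate modulates the fused tensor before prediction. All module names, layer sizes, and the exact gating form are assumptions.

```python
# Hypothetical sketch of LSTM-based fusion with a feature gate;
# names, sizes, and the gating form are assumptions, not the paper's code.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim_a: int, dim_b: int, hidden: int, out_dim: int):
        super().__init__()
        self.enc_a = nn.LSTM(dim_a, hidden, batch_first=True)  # modality A encoder
        self.enc_b = nn.LSTM(dim_b, hidden, batch_first=True)  # modality B encoder
        # Feature gate: per fused feature, decides how much of the fused
        # tensor is passed on to the prediction head.
        self.gate = nn.Linear(2 * hidden, hidden * hidden)
        self.head = nn.Linear(hidden * hidden, out_dim)

    def forward(self, x_a: torch.Tensor, x_b: torch.Tensor) -> torch.Tensor:
        # x_a: (batch, T_a, dim_a); x_b: (batch, T_b, dim_b).
        # The two sequences may have different lengths and sampling rates.
        _, (h_a, _) = self.enc_a(x_a)
        _, (h_b, _) = self.enc_b(x_b)
        h_a, h_b = h_a[-1], h_b[-1]                    # (batch, hidden) each
        fused = torch.einsum("bi,bj->bij", h_a, h_b)   # outer-product fusion tensor
        fused = fused.flatten(1)                       # (batch, hidden * hidden)
        g = torch.sigmoid(self.gate(torch.cat([h_a, h_b], dim=-1)))
        return self.head(g * fused)                    # gated fused features

# Toy usage: 8 samples, 20 lab events (5 features) vs. 50 vital signs (3 features).
model = GatedFusion(dim_a=5, dim_b=3, hidden=16, out_dim=1)
logits = model(torch.randn(8, 20, 5), torch.randn(8, 50, 3))
print(logits.shape)  # torch.Size([8, 1])
```

In the framework described above, the gate additionally controls how often each fused tensor is accessed over time, in the spirit of time-gated recurrent models such as Phased LSTM; the static sigmoid gate here only illustrates the feature-gating idea.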



Author information


Corresponding author

Correspondence to Xianchao Zhang.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Tang, P., Zhang, X. (2022). Features Fusion Framework for Multimodal Irregular Time-series Events. In: Khanna, S., Cao, J., Bai, Q., Xu, G. (eds) PRICAI 2022: Trends in Artificial Intelligence. PRICAI 2022. Lecture Notes in Computer Science, vol 13629. Springer, Cham. https://doi.org/10.1007/978-3-031-20862-1_27

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-20862-1_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20861-4

  • Online ISBN: 978-3-031-20862-1

  • eBook Packages: Computer Science, Computer Science (R0)
