Abstract
Falls are a serious safety hazard for people who are alone, particularly the elderly. Detecting falls quickly and reliably is an effective way to reduce this often unnoticed risk. Most existing fall detection methods rely on either visual data or wearable devices, and both have drawbacks. This paper presents a multimodal approach that integrates video and audio to address the limitations of existing fall detection systems and to improve detection accuracy in challenging environmental conditions. The approach applies attention mechanisms to both the video and audio streams and combines features from the two modalities through feature-level fusion, so that falls can be detected in unfavorable conditions where visual systems alone fail. We evaluated our multimodal fall detection model on the Le2i and UP-Fall datasets and compared its results with those of other fall detection methods. The results indicate that the multimodal model outperforms single-modality fall detection models.
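The general idea of attention-weighted, feature-level fusion can be illustrated with a minimal sketch. It is not the architecture used in the paper: the dot-product attention pooling, the feature dimensions, the parameter names (`video_query`, `audio_query`, `weights`, `bias`), and the random untrained values are all assumptions made for illustration. Each stream is reduced to a single vector by attention over its time steps, the two vectors are concatenated, and a logistic classifier scores the fused vector.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(sequence, query):
    """Collapse a sequence of feature vectors into one vector.

    Each time step is scored by its dot product with `query`; the
    softmax of those scores weights the sum over time steps.
    """
    scores = [sum(x * q for x, q in zip(step, query)) for step in sequence]
    weights = softmax(scores)
    dim = len(sequence[0])
    pooled = [sum(w * step[i] for w, step in zip(weights, sequence))
              for i in range(dim)]
    return pooled, weights


def fuse_and_classify(video_seq, audio_seq, params):
    """Feature-level fusion: pool each stream, concatenate, then classify."""
    video_vec, _ = attention_pool(video_seq, params["video_query"])
    audio_vec, _ = attention_pool(audio_seq, params["audio_query"])
    fused = video_vec + audio_vec  # list concatenation of the two feature vectors
    logit = sum(w * x for w, x in zip(params["weights"], fused)) + params["bias"]
    return 1.0 / (1.0 + math.exp(-logit))  # probability assigned to "fall"


# Hypothetical shapes: 8 video steps x 4 features, 16 audio steps x 3 features.
# All values are random placeholders, not trained parameters or real data.
rng = random.Random(0)
video_seq = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(8)]
audio_seq = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(16)]
params = {
    "video_query": [rng.gauss(0, 1) for _ in range(4)],
    "audio_query": [rng.gauss(0, 1) for _ in range(3)],
    "weights": [rng.gauss(0, 1) for _ in range(7)],
    "bias": 0.0,
}
fall_probability = fuse_and_classify(video_seq, audio_seq, params)
```

In this sketch the fusion happens before classification, which is what distinguishes feature-level fusion from decision-level fusion, where each modality would be classified separately and only the per-modality decisions combined.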
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jamali, M. et al. (2025). Video-Audio Multimodal Fall Detection Method. In: Hadfi, R., Anthony, P., Sharma, A., Ito, T., Bai, Q. (eds) PRICAI 2024: Trends in Artificial Intelligence. PRICAI 2024. Lecture Notes in Computer Science, vol 15284. Springer, Singapore. https://doi.org/10.1007/978-981-96-0125-7_6
DOI: https://doi.org/10.1007/978-981-96-0125-7_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-0124-0
Online ISBN: 978-981-96-0125-7
eBook Packages: Computer Science (R0)