Space-Time Memory Networks for Multi-person Skeleton Body Part Detection

Dufour, Rémi; Meurie, Cyril; Lézoray, Olivier; Mahtani, Ankur

doi:10.1007/978-3-031-09282-4_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13364)

Included in the following conference series:

International Conference on Pattern Recognition and Artificial Intelligence

Abstract

Deep CNNs have recently led to new standards in all fields of computer vision, with specialized architectures for most challenges, including Video Object Segmentation and Pose Tracking. We extend Space-Time Memory Networks to the simultaneous detection of multiple object parts. This enables the detection of human body parts for multiple persons in videos. Results in terms of F1-score are satisfactory (a score of 47.6 with the best configuration evaluated on the PoseTrack18 dataset) and encouraging for follow-up work.

Acknowledgments

This research work contributes to the French collaborative project TASV (autonomous passengers service train), with SNCF, Alstom Crespin, Thales, Bosch, and SpirOps. It was carried out in the framework of FCS Railenium, Famars, and co-financed by the European Union with the European Regional Development Fund (Hauts-de-France region).

Author information

Authors and Affiliations

FCS Railenium, 59300, Famars, France
Rémi Dufour, Cyril Meurie, Olivier Lézoray & Ankur Mahtani
Univ Gustave Eiffel, COSYS-LEOST, 59650, Villeneuve d’Ascq, France
Cyril Meurie
Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, Caen, France
Olivier Lézoray

Corresponding authors

Correspondence to Rémi Dufour, Cyril Meurie, Olivier Lézoray or Ankur Mahtani.

Editor information

Editors and Affiliations

Télécom SudParis, Palaiseau, France
Mounîm El Yacoubi
École de Technologie Supérieure, Montreal, QC, Canada
Eric Granger
Hong Kong Baptist University, Kowloon, Hong Kong
Pong Chi Yuen
Indian Statistical Institute, Kolkata, India
Umapada Pal
Université Paris Cité, Paris, France
Nicole Vincent

About this paper

Cite this paper

Dufour, R., Meurie, C., Lézoray, O., Mahtani, A. (2022). Space-Time Memory Networks for Multi-person Skeleton Body Part Detection. In: El Yacoubi, M., Granger, E., Yuen, P.C., Pal, U., Vincent, N. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2022. Lecture Notes in Computer Science, vol 13364. Springer, Cham. https://doi.org/10.1007/978-3-031-09282-4_7

DOI: https://doi.org/10.1007/978-3-031-09282-4_7
Published: 29 May 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09281-7
Online ISBN: 978-3-031-09282-4
eBook Packages: Computer Science, Computer Science (R0)
