Abstract
Recent advances in deep learning and computer vision offer an excellent opportunity to investigate high-level visual analysis tasks such as human localization and human pose estimation. Although the performance of human localization and human pose estimation has improved significantly in recent reports, neither is perfect, and erroneous estimates of position and pose must be expected across video frames. Studies on integrating these techniques into a generic pipeline that is robust to such errors are still lacking. This paper addresses that gap. We explored and developed two working pipelines suited to visual-based positioning and pose estimation tasks. The proposed pipelines were analysed on a badminton game. We showed that the concept of tracking by detection can work well, and that errors in position and pose can be handled effectively by linear interpolation of information from nearby frames. The results showed that visual-based positioning and pose estimation can deliver position and pose estimates with good spatial and temporal resolution.
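As a concrete illustration of the two ideas highlighted in the abstract, the minimal sketch below pairs tracking-by-detection (greedily linking per-frame detections by IoU overlap) with linear interpolation of missed frames from the nearest valid neighbouring frames. This is a hypothetical simplification, not the paper's released pipeline; all function and parameter names (e.g. `track_by_detection`, `iou_thresh`) are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): tracking-by-detection via greedy
# IoU matching, plus linear interpolation of frames where the detector
# missed or was rejected. Names and thresholds are illustrative assumptions.

import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def track_by_detection(frames, iou_thresh=0.3):
    """frames: list of per-frame detection lists (each detection is a box).
    Returns one track: the box in each frame linked to the previous frame's
    box, with None where no detection matched (a miss to interpolate)."""
    track, prev = [], None
    for dets in frames:
        best = None
        if prev is not None and dets:
            best = max(dets, key=lambda d: iou(prev, d))
            if iou(prev, best) < iou_thresh:
                best = None          # reject a spurious/implausible match
        elif dets:
            best = dets[0]           # initialise the track
        track.append(best)
        prev = best if best is not None else prev
    return track

def interpolate_misses(track):
    """Fill missed frames by linear interpolation between the nearest
    valid frames on either side; leading/trailing misses stay None."""
    track = list(track)
    valid = [i for i, b in enumerate(track) if b is not None]
    for i, b in enumerate(track):
        if b is None:
            left = max((j for j in valid if j < i), default=None)
            right = min((j for j in valid if j > i), default=None)
            if left is not None and right is not None:
                w = (i - left) / (right - left)
                track[i] = tuple((1 - w) * np.array(track[left])
                                 + w * np.array(track[right]))
    return track
```

The same interpolation step applies joint-by-joint to 2D pose keypoints: each joint's (x, y) coordinates in a missed or low-confidence frame can be filled from the nearest frames in which that joint was estimated reliably.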
Notes
1. Action analysis based on a skeleton figure, i.e., a stick-man figure.
2. AR/VR applications can track the position of a head-mounted display in 3D world space either with external sensors (outside-in) or with internal sensors mounted on the display device itself (inside-out).
3. Matterport.
Acknowledgments
This publication is the output of the ASEAN IVO (http://www.nict.go.jp/en/asean_ivo/index.html) project titled "Event Analysis: Applications of computer vision and AI in smart tourism industry" and is financially supported by NICT (http://www.nict.go.jp/en/index.html). We would also like to thank the anonymous reviewers for their constructive comments and suggestions.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Phon-Amnuaisuk, S., Murata, K.T., Kovavisaruch, LO., Lim, TH., Pavarangkoon, P., Mizuhara, T. (2020). Visual-Based Positioning and Pose Estimation. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Communications in Computer and Information Science, vol 1332. Springer, Cham. https://doi.org/10.1007/978-3-030-63820-7_68
DOI: https://doi.org/10.1007/978-3-030-63820-7_68
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63819-1
Online ISBN: 978-3-030-63820-7
eBook Packages: Computer Science, Computer Science (R0)