Visual-Based Positioning and Pose Estimation | SpringerLink

Visual-Based Positioning and Pose Estimation

  • Conference paper
  • First Online:
Neural Information Processing (ICONIP 2020)

Abstract

Recent advances in deep learning and computer vision offer an excellent opportunity to investigate high-level visual analysis tasks such as human localization and human pose estimation. Although the performance of human localization and human pose estimation has improved significantly in recent reports, it is not perfect, and erroneous estimates of position and pose can be expected across video frames. Studies on integrating these techniques into a generic pipeline that is robust to such errors are still lacking. This paper fills that gap. We explored and developed two working pipelines suited to visual-based positioning and pose estimation tasks, and analyzed them on a badminton game. We showed that the concept of tracking by detection works well, and that errors in position and pose can be handled effectively by linear interpolation of information from nearby frames. The results showed that visual-based positioning and pose estimation can deliver position and pose estimates with good spatial and temporal resolution.
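The error-handling step described above — repairing erroneous or missing per-frame estimates by linear interpolation from nearby frames — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the array layout (frames × keypoints × coordinates, with failed detections marked as NaN) and the function name are hypothetical.

```python
import numpy as np

def interpolate_missing_poses(poses):
    """Fill missing per-frame pose estimates by linear interpolation.

    poses: float array of shape (T, K, 2) -- T frames, K keypoints,
    (x, y) coordinates; frames where detection failed hold NaN.
    Assumes each keypoint is detected in at least one frame.
    """
    poses = poses.copy()
    frames = np.arange(poses.shape[0])
    for k in range(poses.shape[1]):          # each keypoint
        for d in range(poses.shape[2]):      # each coordinate (x, y)
            track = poses[:, k, d]           # view into the copy
            valid = ~np.isnan(track)
            # np.interp fills each gap linearly from the nearest
            # valid frames on either side (endpoints are clamped)
            track[~valid] = np.interp(frames[~valid],
                                      frames[valid],
                                      track[valid])
    return poses
```

For example, a keypoint at (0, 0) in frame 0 and (2, 4) in frame 2, with frame 1 missing, is filled in as (1, 2).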


Notes

  1. Action analysis based on a skeleton figure, i.e., a stick-man figure.

  2. AR/VR applications may track the position of a head-mounted display in a 3D world space either with external sensors (outside-in) or with internal sensors equipped on the head-mounted display device itself (inside-out).

  3. Matterport.


Acknowledgments

This publication is the output of the ASEAN IVO (http://www.nict.go.jp/en/asean_ivo/index.html) project titled Event Analysis: Applications of computer vision and AI in smart tourism industry and financially supported by NICT (http://www.nict.go.jp/en/index.html). We would also like to thank anonymous reviewers for their constructive comments and suggestions.

Author information

Corresponding author

Correspondence to Somnuk Phon-Amnuaisuk.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Phon-Amnuaisuk, S., Murata, K.T., Kovavisaruch, LO., Lim, TH., Pavarangkoon, P., Mizuhara, T. (2020). Visual-Based Positioning and Pose Estimation. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Communications in Computer and Information Science, vol 1332. Springer, Cham. https://doi.org/10.1007/978-3-030-63820-7_68

  • DOI: https://doi.org/10.1007/978-3-030-63820-7_68

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63819-1

  • Online ISBN: 978-3-030-63820-7

  • eBook Packages: Computer Science, Computer Science (R0)
