MobileHand: Real-Time 3D Hand Shape and Pose Estimation from Color Image

  • Conference paper
Neural Information Processing (ICONIP 2020)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1332))

Abstract

We present an approach for real-time estimation of 3D hand shape and pose from a single RGB image. To achieve real-time performance, we use an efficient convolutional neural network (CNN), MobileNetV3-Small, to extract key features from the input image. The extracted features are then passed to an iterative 3D regression module that infers the camera parameters, hand shape and joint angles used to project and articulate a 3D hand model. By combining the deep neural network with the differentiable hand model, we can train the network end-to-end with supervision from 2D and 3D annotations. Experiments on two publicly available datasets demonstrate that our approach matches the accuracy of most existing methods while running at over 110 Hz on a GPU or 75 Hz on a CPU.
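The iterative regression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single linear layer stands in for whatever regression network the paper uses, the parameter sizes are placeholders rather than values from the paper, and the weak-perspective camera (one scale and a 2D translation) is an assumption about how the camera parameters are applied. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Placeholder sizes (assumptions, not values from the paper):
# camera = (scale, tx, ty), plus shape coefficients and joint angles.
N_CAM, N_SHAPE, N_POSE = 3, 10, 23
PARAM_DIM = N_CAM + N_SHAPE + N_POSE


def matvec(w: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Plain matrix-vector product, kept dependency-free for the sketch."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def iterative_regression(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],  # shape: PARAM_DIM x (len(features) + PARAM_DIM)
    bias: Sequence[float],
    init_params: Sequence[float],
    n_iter: int = 3,
) -> List[float]:
    """Refine a parameter estimate by repeatedly predicting a residual update.

    Each iteration sees the image features concatenated with the current
    estimate and outputs a correction that is added to that estimate.
    A single linear layer is a stand-in for the real regressor (assumption).
    """
    params = list(init_params)
    for _ in range(n_iter):
        x = list(features) + params
        delta = [d + b for d, b in zip(matvec(weights, x), bias)]
        params = [p + d for p, d in zip(params, delta)]
    return params


def split_params(params: Sequence[float]) -> Tuple[List[float], List[float], List[float]]:
    """Split the flat vector into camera, shape and joint-angle parts."""
    cam = list(params[:N_CAM])
    shape = list(params[N_CAM:N_CAM + N_SHAPE])
    pose = list(params[N_CAM + N_SHAPE:])
    return cam, shape, pose


def weak_perspective_project(
    joints_3d: Sequence[Sequence[float]], cam: Sequence[float]
) -> List[List[float]]:
    """Project 3D joints to 2D with an assumed weak-perspective camera."""
    s, tx, ty = cam
    return [[s * x + tx, s * y + ty] for x, y, _z in joints_3d]
```

In a full pipeline the shape and joint-angle outputs would drive a differentiable hand model to produce the 3D joints, and the projected 2D joints would be compared with 2D annotations during training; the hand model itself is left out of this sketch.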

Supported by Agency for Science, Technology and Research (A*STAR), Nanyang Technological University (NTU) and the National Healthcare Group (NHG). Project code: RFP/19003.

Acknowledgments

The computational work for this article was partially performed on resources of the National Supercomputing Centre, Singapore (https://www.nscc.sg).

Author information

Corresponding author

Correspondence to Guan Ming Lim.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Lim, G.M., Jatesiktat, P., Ang, W.T. (2020). MobileHand: Real-Time 3D Hand Shape and Pose Estimation from Color Image. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Communications in Computer and Information Science, vol 1332. Springer, Cham. https://doi.org/10.1007/978-3-030-63820-7_52

  • DOI: https://doi.org/10.1007/978-3-030-63820-7_52

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63819-1

  • Online ISBN: 978-3-030-63820-7

  • eBook Packages: Computer Science, Computer Science (R0)
