Multi-cue hand detection and tracking for a head-mounted augmented reality system | Machine Vision and Applications

Multi-cue hand detection and tracking for a head-mounted augmented reality system

  • Original Paper
Machine Vision and Applications

Abstract

With recent developments in wearable augmented reality (AR), natural human–computer interaction is becoming more important. Auxiliary interaction hardware adds complexity, weight and cost to wearable AR systems, so natural means of interaction such as gestures are more desirable. In this paper, we present a novel multi-cue hand detection and tracking method for head-mounted AR systems that combines depth, color, intensity and curvilinearity. Combining different cues increases the detection rate and eliminates background regions, which in turn improves tracking performance under challenging conditions. The detected hand positions and trajectories are used to perform actions such as click and select. Moreover, the 6 DOF poses of the hands are calculated by approximating the segmented regions with planes, so that a planar menu (interface) can be rendered around the hand and the hand can be used as a planar selection tool. The proposed system is tested on different scenarios (including markers for reference), and the results show that it can detect and track the hands successfully in challenging conditions such as cluttered and dynamic environments and varying illumination. The proposed hand tracker outperforms other well-known hand trackers under these conditions.
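The plane approximation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the segmented hand region is available as a list of 3D points in camera coordinates (camera at the origin, looking along +z) and uses an ordinary least-squares fit of z as a function of x and y, which is only one of several ways to fit a plane. The function name `fit_plane` is hypothetical.

```python
from math import sqrt


def fit_plane(points):
    """Least-squares plane fit to 3D points given as (x, y, z) tuples.

    Assumption (not from the paper): the plane is not parallel to the
    camera's viewing axis, so it can be written as z = a*x + b*y + c.
    Returns (centroid, unit_normal), with the normal facing the camera.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points to fit a plane")

    # Centre the data so the offset c drops out of the normal equations.
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    sxx = sxy = syy = sxz = syz = 0.0
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
        sxz += dx * dz
        syz += dy * dz

    # Solve the 2x2 normal equations for the slopes a and b.
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("points are collinear or the plane is degenerate")
    a = (sxz * syy - syz * sxy) / det
    b = (syz * sxx - sxz * sxy) / det

    # (a, b, -1) is perpendicular to the plane and has negative z,
    # i.e. it points back toward a camera looking along +z.
    length = sqrt(a * a + b * b + 1.0)
    normal = (a / length, b / length, -1.0 / length)
    return (cx, cy, cz), normal
```

The centroid gives the translational part of a pose and the normal fixes two of the three rotational degrees of freedom; the in-plane rotation would need an additional cue, such as the principal direction of the segmented region. With noisy depth data, a robust estimator would likely be preferable to plain least squares.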

Author information

Corresponding author

Correspondence to Oytun Akman.

About this article

Cite this article

Akman, O., Poelman, R., Caarls, W. et al. Multi-cue hand detection and tracking for a head-mounted augmented reality system. Machine Vision and Applications 24, 931–946 (2013). https://doi.org/10.1007/s00138-013-0500-6

  • DOI: https://doi.org/10.1007/s00138-013-0500-6
