Robust and Rapid Generation of Animated Faces from Video Images: A Model-Based Modeling Approach

Zhang, Zhengyou; Liu, Zicheng; Adler, Dennis; Cohen, Michael F.; Hanson, Erik; Shan, Ying

doi:10.1023/B:VISI.0000015915.50080.85

Published: July 2004

International Journal of Computer Vision, Volume 58, pages 93–119 (2004)

Abstract

We have developed an easy-to-use and cost-effective system to construct textured 3D animated face models from videos with minimal user interaction. This is a particularly challenging task for faces due to a lack of prominent textures. We develop a robust system by following a model-based approach: we make full use of generic knowledge of faces in head motion determination, head tracking, model fitting, and multiple-view bundle adjustment. Our system first uses an ordinary video camera to capture images of a person sitting in front of the camera and turning their head from one side to the other. After five manual clicks on two images to indicate the positions of the eye corners, nose tip, and mouth corners, the system automatically generates a realistic-looking 3D human head model that can be animated immediately (different poses, facial expressions, and talking). A user with a PC and a video camera can use our system to generate their face model in a few minutes. The face model can then be imported into their favorite game, so that users see themselves and their friends take part in the game they are playing. We have demonstrated the system live on a laptop computer at many events and constructed face models for hundreds of people. It works robustly under various environment settings.

Author information

Authors and Affiliations

Microsoft Research, One Microsoft Way, Redmond, WA, 98052, USA
Zhengyou Zhang, Zicheng Liu, Dennis Adler, Michael F. Cohen, Erik Hanson & Ying Shan

About this article

Cite this article

Zhang, Z., Liu, Z., Adler, D. et al. Robust and Rapid Generation of Animated Faces from Video Images: A Model-Based Modeling Approach. International Journal of Computer Vision 58, 93–119 (2004). https://doi.org/10.1023/B:VISI.0000015915.50080.85

Issue Date: July 2004
DOI: https://doi.org/10.1023/B:VISI.0000015915.50080.85
