Abstract
We have developed an easy-to-use and cost-effective system to construct textured 3D animated face models from videos with minimal user interaction. This is a particularly challenging task for faces due to a lack of prominent textures. We develop a robust system by following a model-based approach: we make full use of generic knowledge of faces in head motion determination, head tracking, model fitting, and multiple-view bundle adjustment. Our system first uses an ordinary video camera to capture images of a person sitting in front of the camera and turning their head from one side to the other. After five manual clicks on two images to indicate the positions of the eye corners, nose tip, and mouth corners, the system automatically generates a realistic-looking 3D human head model that can be animated immediately (different poses, facial expressions, and talking). A user with a PC and a video camera can use our system to generate their own face model in a few minutes. The face model can then be imported into the user's favorite game, where users see themselves and their friends take part in the game they are playing. We have demonstrated the system live on a laptop computer at many events and constructed face models for hundreds of people. It works robustly under various environment settings.
Cite this article
Zhang, Z., Liu, Z., Adler, D. et al. Robust and Rapid Generation of Animated Faces from Video Images: A Model-Based Modeling Approach. International Journal of Computer Vision 58, 93–119 (2004). https://doi.org/10.1023/B:VISI.0000015915.50080.85