Learning Articulated Structure and Motion

Ross, David A.; Tarlow, Daniel; Zemel, Richard S.

doi:10.1007/s11263-010-0325-y

Published: 02 March 2010

Volume 88, pages 214–237 (2010)

International Journal of Computer Vision

Abstract

Humans demonstrate a remarkable ability to parse complicated motion sequences into their constituent structures and motions. We investigate this problem, attempting to learn the structure of one or more articulated objects, given a time series of two-dimensional feature positions. We model the observed sequence in terms of “stick figure” objects, under the assumption that the relative joint angles between sticks can change over time, but their lengths and connectivities are fixed. The problem is formulated as a single probabilistic model that includes multiple sub-components: associating the features with particular sticks, determining the proper number of sticks, and finding which sticks are physically joined. We test the algorithm on challenging datasets of 2D projections of optical human motion capture and feature trajectories from real videos.
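
As a rough illustration of the "stick figure" assumption stated in the abstract (fixed stick lengths and connectivity, joint angles that change over time), the following minimal sketch generates 2D positions for a planar two-stick chain. It is not the authors' model or code: the function names `two_stick_positions` and `stick_lengths` are hypothetical, and the sketch covers only the in-plane toy case, ignoring projection from 3D, noise, and the learning of structure.

```python
import math


def two_stick_positions(length_a, length_b, angles_a, angles_b):
    """Return per-frame 2D points (base, joint, tip) of two joined sticks.

    Hypothetical toy setup: stick A starts at the origin and stick B starts
    where A ends. angles_a[t] is A's orientation and angles_b[t] is B's
    angle relative to A (the joint angle), both in radians.
    """
    frames = []
    for theta_a, theta_b in zip(angles_a, angles_b):
        joint = (length_a * math.cos(theta_a), length_a * math.sin(theta_a))
        phi = theta_a + theta_b  # absolute orientation of stick B
        tip = (joint[0] + length_b * math.cos(phi),
               joint[1] + length_b * math.sin(phi))
        frames.append(((0.0, 0.0), joint, tip))
    return frames


def stick_lengths(frame):
    """Measure both stick lengths from one frame of point positions."""
    base, joint, tip = frame
    return math.dist(base, joint), math.dist(joint, tip)


if __name__ == "__main__":
    # Angles vary from frame to frame; the measured lengths do not.
    for frame in two_stick_positions(2.0, 1.0, [0.0, 0.5, 1.0], [0.3, -0.4, 1.2]):
        print(stick_lengths(frame))
```

In this toy setting the measured lengths stay constant (up to floating-point rounding) while the point positions move, which is the kind of invariant the abstract's assumption describes.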

Author information

Authors and Affiliations

University of Toronto, 10 King’s College Road, Toronto, ON, M5S 3G4, Canada
David A. Ross, Daniel Tarlow & Richard S. Zemel

Corresponding author

Correspondence to David A. Ross.

About this article

Cite this article

Ross, D.A., Tarlow, D. & Zemel, R.S. Learning Articulated Structure and Motion. Int J Comput Vis 88, 214–237 (2010). https://doi.org/10.1007/s11263-010-0325-y

Received: 21 July 2008
Accepted: 09 February 2010
Published: 02 March 2010
Issue Date: June 2010
DOI: https://doi.org/10.1007/s11263-010-0325-y
