Abstract
This paper addresses the temporal synchronization of human actions captured from multiple views. Previous work typically aligns multi-view videos frame by frame, exploiting features such as interest-point trajectories or 3D human motion descriptors to detect individual events. However, because real-world backgrounds are complex and dynamic, such traditional image-based features are poorly suited to video representation. We instead represent actions across views with robust spatio-temporal features and self-similarity matrices. Multiple sequences are aligned over temporal patches (sliding windows) using a hierarchical Dynamic Time Warping algorithm, with alignment quality measured by meta-action classifiers. Experiments on two datasets, the Pump dataset and the Olympic dataset, demonstrate the effectiveness of the method and its suitability for general video event data.
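As a rough illustration of the alignment step summarized above, the Python sketch below builds a temporal self-similarity matrix from per-frame features, derives a simplified per-frame descriptor from it, and aligns two sequences with classic Dynamic Time Warping. All function names, the toy data, and the local-patch descriptor are hypothetical simplifications for exposition; the paper's actual method works hierarchically over sliding windows and scores alignments with meta-action classifiers, which are not reproduced here.

```python
import numpy as np

def self_similarity_matrix(features):
    """Temporal self-similarity matrix: pairwise Euclidean distances
    between the per-frame feature vectors of one video."""
    diff = features[:, None, :] - features[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def ssm_descriptors(ssm, radius=5):
    """Per-frame descriptor: a local diagonal patch of the SSM
    (a simplified stand-in for the descriptors used in
    self-similarity-based action representation)."""
    n = ssm.shape[0]
    padded = np.pad(ssm, radius, mode="edge")
    return np.stack([padded[i:i + 2 * radius + 1, i:i + 2 * radius + 1].ravel()
                     for i in range(n)])

def dtw_align(desc_a, desc_b):
    """Classic DTW over per-frame descriptors; returns the total cost
    and the frame-to-frame warping path."""
    na, nb = len(desc_a), len(desc_b)
    cost = np.full((na + 1, nb + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, na + 1):
        for j in range(1, nb + 1):
            d = np.linalg.norm(desc_a[i - 1] - desc_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack the optimal warping path.
    path, i, j = [], na, nb
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return cost[na, nb], path[::-1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(60, 32))                        # toy per-frame features, view 1
    view_b = view_a[::2] + 0.05 * rng.normal(size=(30, 32))   # same action at double speed, view 2
    da = ssm_descriptors(self_similarity_matrix(view_a))
    db = ssm_descriptors(self_similarity_matrix(view_b))
    total_cost, warp = dtw_align(da, db)
    print(f"DTW cost: {total_cost:.2f}, path length: {len(warp)}")
```

In this toy setup the second view is a temporally subsampled copy of the first, so the recovered warping path should roughly map every frame of view B to two frames of view A, mimicking the kind of temporal offset and speed difference the synchronization method is meant to recover.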
Copyright information
© 2014 Springer International Publishing Switzerland
Cite this paper
Zhang, L., Tang, S., Singhal, S., Ding, G. (2014). Multi-view Action Synchronization in Complex Background. In: Gurrin, C., Hopfgartner, F., Hurst, W., Johansen, H., Lee, H., O’Connor, N. (eds) MultiMedia Modeling. MMM 2014. Lecture Notes in Computer Science, vol 8326. Springer, Cham. https://doi.org/10.1007/978-3-319-04117-9_14
DOI: https://doi.org/10.1007/978-3-319-04117-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04116-2
Online ISBN: 978-3-319-04117-9