Multisource surveillance video data coding with hierarchical knowledge library

Chen, Yu; Hu, Ruimin; Xiao, Jing; Xu, Liang; Wang, Zhongyuan

doi:10.1007/s11042-018-6825-4

Published: 13 November 2018

Multimedia Tools and Applications, Volume 78, pages 14705–14731 (2019)

Yu Chen¹, Ruimin Hu¹,², Jing Xiao¹, Liang Xu¹ & Zhongyuan Wang¹

Abstract

The rapid growth of surveillance video data challenges existing video coding standards. Knowledge-based video coding schemes have been proposed to remove the redundancy of moving objects across multiple videos and have achieved large gains in coding efficiency, but they still have difficulty coping with complicated visual changes of objects that result from various factors. In this paper, a novel hierarchical knowledge extraction method is proposed. Common knowledge on three coarse-to-fine levels, namely the category level, the object level and the video level, is extracted from history data to model the initial appearance, stable changes and temporal changes, respectively, for better object representation and redundancy removal. In addition, we apply the extracted hierarchical knowledge to surveillance video coding tasks and establish a hybrid prediction-based coding framework. On the one hand, hierarchical knowledge is projected onto the image plane to generate references for I frames and achieve better prediction performance. On the other hand, we develop a transform-based prediction for P/B frames that reduces computational complexity while improving coding efficiency. Experimental results demonstrate the effectiveness of the proposed method.
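The three-level idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the levels are combined, so the additive composition below, the flattened `Patch` vectors, and every name (`KnowledgeLibrary`, `reference`, `residual`, the sample identifiers) are assumptions made purely for illustration. The sketch only shows why a reference built from category, object and video knowledge can leave a smaller residual to encode.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Toy stand-in for an object's appearance: a flat list of pixel values.
# (Assumption: the paper's actual representation is not given in the abstract.)
Patch = List[float]


def add(a: Patch, b: Patch) -> Patch:
    """Element-wise sum of two equally sized patches."""
    return [x + y for x, y in zip(a, b)]


def sub(a: Patch, b: Patch) -> Patch:
    """Element-wise difference of two equally sized patches."""
    return [x - y for x, y in zip(a, b)]


@dataclass
class KnowledgeLibrary:
    """Hypothetical three-level library, coarse to fine."""
    category: Dict[str, Patch] = field(default_factory=dict)  # initial appearance
    obj: Dict[str, Patch] = field(default_factory=dict)       # stable changes
    video: Dict[str, Patch] = field(default_factory=dict)     # temporal changes

    def reference(self, category_id: str, object_id: str, video_id: str) -> Patch:
        """Build a prediction reference by stacking the three levels.

        Additive composition is an assumption of this sketch. Missing
        object-level or video-level entries fall back to a zero correction.
        """
        ref = list(self.category[category_id])
        zero = [0.0] * len(ref)
        ref = add(ref, self.obj.get(object_id, zero))
        ref = add(ref, self.video.get(video_id, zero))
        return ref


def residual(observed: Patch, ref: Patch) -> Patch:
    """What would remain to be coded after prediction from the reference."""
    return sub(observed, ref)


lib = KnowledgeLibrary(
    category={"car": [10.0, 10.0, 10.0]},
    obj={"car_17": [1.0, -1.0, 0.0]},
    video={"cam_3": [0.5, 0.5, 0.5]},
)
observed = [12.0, 9.0, 11.0]
ref = lib.reference("car", "car_17", "cam_3")
res = residual(observed, ref)
coarse_res = residual(observed, lib.reference("car", "unknown", "unknown"))
print(ref, res, coarse_res)
```

In this toy example the residual against the full three-level reference has smaller magnitudes than the residual against the category-level reference alone, which is the intuition behind adding finer levels of knowledge.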

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 61502348, 61671336 and 91738302, by the Natural Science Foundation of Jiangsu Province under Grant BK20180234, by the Open Research Fund of the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, under Grant 17E03, and by the National Key R&D Program of China under Grant 2018YFB1201602.

Author information

Authors and Affiliations

National Engineering Research Center for Multimedia and Software, Wuhan University, Wuhan, 430072, China
Yu Chen, Ruimin Hu, Jing Xiao, Liang Xu & Zhongyuan Wang
Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, 430072, China
Ruimin Hu

Corresponding author

Correspondence to Ruimin Hu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, Y., Hu, R., Xiao, J. et al. Multisource surveillance video data coding with hierarchical knowledge library. Multimed Tools Appl 78, 14705–14731 (2019). https://doi.org/10.1007/s11042-018-6825-4

Received: 02 May 2018
Revised: 14 October 2018
Accepted: 24 October 2018
Published: 13 November 2018
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s11042-018-6825-4
