Key moment extraction for designing an agglomerative clustering algorithm-based video summarization framework | Neural Computing and Applications

  • S.I.: Machine Learning Applications for Security
Neural Computing and Applications

Abstract

Video summarization is the process of condensing an original video into a more concise form without losing valuable information. Efficient storage and extraction of valuable information are both challenging tasks in video analysis. Intelligent video surveillance systems play an essential role in ensuring public safety and security, and they are now used extensively in areas ranging from border security to street monitoring. Surveillance cameras and motion-sensitive cameras produce large volumes of data when recording video. Because human analysis of these videos demands immense manpower, automatic video summarization is an important and growing research topic. Hence, it is necessary to summarize the activities in the scene and eliminate unusual and redundant events recorded in videos. The proposed work develops a video summarization framework that uses key moment-based frame selection and clustering of frames to identify only the informative frames. A key moment is a simple yet effective characteristic for summarizing a long video shot. Motion is the most salient feature in presenting actions or events in video, and it is used here to extract the key moments of the video frames: at a key moment, the motion in the scene of a video frame shows the most acceleration and deceleration. Based on the extracted key moments, the frames of the video are partitioned into groups using a novel similarity-based agglomerative clustering algorithm. The algorithm determines at most K clusters of frames based on the Jaccard similarity among the clusters, where K is a user-defined parameter set to 5% to 15% of the size of the video to be summarized. From each cluster, a few representative frames are identified based on the cluster centroids and arranged according to their original video sequence to generate the summary of the video. The proposed clustering algorithm and the summarization method are evaluated on state-of-the-art video datasets and compared with related methodologies to demonstrate their effectiveness.
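
The clustering and selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, whose details are not given here. It assumes that each frame is represented by a set of hypothetical key-moment tokens, that the similarity between two clusters is the Jaccard similarity of their pooled token sets, and that the "centroid" of a cluster can be approximated by its medoid (the frame most similar to the rest of its cluster). The names `choose_k`, `pooled`, `agglomerate`, `representative`, and `summarize` are illustrative only, and a single representative is kept per cluster for brevity.

```python
from itertools import combinations


def choose_k(n_frames: int, percent: int = 10) -> int:
    """Upper bound K as a percentage of the frame count (at least 1)."""
    return max(1, n_frames * percent // 100)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity |a & b| / |a | b|, taken as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pooled(cluster, features):
    """Union of the token sets of all frames in a cluster (assumed definition)."""
    return frozenset().union(*(features[i] for i in cluster))


def agglomerate(features, k):
    """Merge the most Jaccard-similar pair of clusters until at most k remain.

    Expects k >= 1; every frame starts in its own cluster.
    """
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > k:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: jaccard(pooled(clusters[p[0]], features),
                                  pooled(clusters[p[1]], features)),
        )
        clusters[a] += clusters[b]  # a < b, so deleting b keeps index a valid
        del clusters[b]
    return clusters


def representative(cluster, features):
    """Medoid stand-in for a centroid: frame with the highest total similarity."""
    return max(
        cluster,
        key=lambda i: sum(jaccard(features[i], features[j]) for j in cluster),
    )


def summarize(features, k):
    """One representative per cluster, returned in original frame order."""
    return sorted(representative(c, features) for c in agglomerate(features, k))


# Toy input: six frames, each described by a set of made-up token ids.
frames = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 3, 4}),
    frozenset({2, 3, 4}),
    frozenset({7, 8}),
    frozenset({7, 8, 9}),
    frozenset({8, 9}),
]
summary = summarize(frames, k=2)
print(summary)  # [1, 4]
```

In this toy run, frames 0 to 2 and frames 3 to 5 end up in two clusters, and the frame sharing the most tokens with its neighbours is kept from each. For a real video, K could be derived with `choose_k(n_frames, percent)` using a percentage in the 5 to 15 range mentioned in the abstract. The pairwise search is recomputed at every merge, which is acceptable for a sketch but would be slow on long videos.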

Author information

Corresponding author

Correspondence to Asit Kumar Das.

Ethics declarations

Conflict of interest

The authors declare that this manuscript has no conflict of interest with any other published source and has not been published previously (partly or in full). No data have been fabricated or manipulated to support our conclusions.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yasmin, G., Chowdhury, S., Nayak, J. et al. Key moment extraction for designing an agglomerative clustering algorithm-based video summarization framework. Neural Comput & Applic 35, 4881–4902 (2023). https://doi.org/10.1007/s00521-021-06132-1

  • DOI: https://doi.org/10.1007/s00521-021-06132-1
