Abstract
Demand for short video generation is growing, especially in sports news reporting, which urgently needs automatic video summarization methods to reduce time and labor costs. This paper focuses on NBA basketball videos and addresses the practical needs of news reporting in sports video summarization. We propose a hierarchical-grained deep reinforcement learning framework to generate short basketball videos. Given a long basketball game video, we first apply a hierarchical-grained subshot segmentation algorithm that takes both semantic and objective factors into account and preserves spatiotemporal consistency. We then select candidate frames through a news-element-enhanced deep reinforcement learning framework. On this basis, we implement a news-report-oriented video summarization algorithm based on probability sampling that fuses multiple games and multiple news elements. Experimental results on our newly collected NBA dataset demonstrate the effectiveness of the proposed framework. Moreover, the proposed method highlights the video content while keeping the news elements well preserved.
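The abstract does not spell out the probability-sampling step, so the following is only an illustrative sketch of one way such a step could look, not the authors' algorithm. It assumes that each candidate subshot already carries a positive importance score (for example, one produced by a learned selector) and that the summary has a fixed duration budget. The `Subshot` class, the `sample_summary` function, the scores, and the budget are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Subshot:
    start: float  # start time in seconds
    end: float    # end time in seconds
    score: float  # hypothetical importance score, assumed > 0

    @property
    def duration(self) -> float:
        return self.end - self.start


def sample_summary(subshots, budget, rng):
    """Draw subshots without replacement, with probability proportional
    to score, skipping any subshot that would exceed the duration budget."""
    remaining = list(range(len(subshots)))
    chosen = []
    used = 0.0
    while remaining:
        weights = [subshots[i].score for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(pick)
        if used + subshots[pick].duration <= budget:
            chosen.append(pick)
            used += subshots[pick].duration
    return sorted(chosen)  # restore temporal order


# Toy input: five subshots with made-up scores (assumption, not paper data).
subshots = [
    Subshot(0.0, 4.0, 0.2),
    Subshot(4.0, 9.0, 0.9),
    Subshot(9.0, 12.0, 0.5),
    Subshot(12.0, 20.0, 0.7),
    Subshot(20.0, 23.0, 0.1),
]
summary = sample_summary(subshots, budget=10.0, rng=random.Random(0))
total = sum(subshots[i].duration for i in summary)
print(summary, total)
```

Sampling in proportion to score, rather than always taking the top-scoring subshots, lets repeated runs produce different summaries from the same game while still favoring high-scoring content. How the paper fuses multiple games and news elements into the sampling is not described here and is left out of the sketch.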
Acknowledgments
This work was funded by the Key Research and Development Plan of Zhejiang Province (No. 2019C03131) and the Basic Public Welfare Research Project of Zhejiang Province (No. LGF21F020004).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Ji, N., Zhao, S., Lin, Q., Yu, D., Zhao, Y. (2021). NBA Basketball Video Summarization for News Report via Hierarchical-Grained Deep Reinforcement Learning. In: Peng, Y., Hu, SM., Gabbouj, M., Zhou, K., Elad, M., Xu, K. (eds) Image and Graphics. ICIG 2021. Lecture Notes in Computer Science, vol 12890. Springer, Cham. https://doi.org/10.1007/978-3-030-87361-5_58
DOI: https://doi.org/10.1007/978-3-030-87361-5_58
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87360-8
Online ISBN: 978-3-030-87361-5
eBook Packages: Computer Science, Computer Science (R0)