
MFSR: Maximum feature score region-based captions locating in news video images

  • Research Article
  • Published in: International Journal of Automation and Computing

Abstract

For news video images, caption recognition is a useful and important step toward content understanding, and caption locating is usually its first stage. This paper proposes a simple but effective caption locating algorithm, the maximum feature score region (MFSR) based method, which consists of two stages: in the first stage, the up/down boundaries are obtained from the projection of the edge map; in the second stage, the maximum feature score region is defined and the left/right boundaries are determined by utilizing the MFSR. Experiments show that the proposed MFSR-based method achieves superior and robust performance on news video images of different types.
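The abstract only outlines the two-stage pipeline. As a rough, non-authoritative sketch of how such a locator can be organized (the Canny edge map, the fixed thresholds, and the use of column-wise edge density as the "feature score" are illustrative assumptions, not the paper's definitions), a minimal Python/OpenCV version might look like this:

```python
# A minimal sketch of a two-stage caption locator in the spirit of the
# abstract, NOT the authors' implementation: the Canny edge map, the fixed
# thresholds, and column-wise edge density standing in for the "feature
# score" are all illustrative assumptions.
import numpy as np
import cv2  # OpenCV is assumed to be available


def locate_caption(gray, row_thresh=0.15, col_thresh=0.10):
    """Return (top, bottom, left, right) of a caption-like region, or None."""
    # Stage 1: project the edge map onto the vertical axis; rows with a high
    # edge density delimit the up/down boundaries of the caption band.
    edges = cv2.Canny(gray, 100, 200)  # binary edge map (0/255)
    row_profile = edges.sum(axis=1) / float(edges.shape[1] * 255)
    rows = np.flatnonzero(row_profile > row_thresh * row_profile.max())
    if rows.size == 0:
        return None
    top, bottom = int(rows.min()), int(rows.max())

    # Stage 2: inside the band, score each column (here simply its edge
    # density) and take the span of high-score columns as the left/right
    # boundaries.
    band = edges[top:bottom + 1]
    col_profile = band.sum(axis=0) / float(band.shape[0] * 255)
    cols = np.flatnonzero(col_profile > col_thresh * col_profile.max())
    if cols.size == 0:
        return None
    return top, bottom, int(cols.min()), int(cols.max())


# Hypothetical usage on a grayscale news frame "frame.png":
# box = locate_caption(cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE))
```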

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Nos. 61272394, 61201395 and 61472119), the Program for Science & Technology Innovation Talents in Universities of Henan Province (No. 13HASTIT039), the Henan Polytechnic University Innovative Research Team (No. T2014-3), and the Henan Polytechnic University Fund for Distinguished Young Scholars (No. J2013-2).

Author information

Corresponding author

Correspondence to Hong-Min Liu.

Additional information

Recommended by Associate Editor Victor Becerra

Zhi-Heng Wang received the B. Sc. degree in mechatronic engineering from Beijing Institute of Technology, China in 2004, and the Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences, China in 2009. Currently, he is an associate professor at the School of Computer Science and Technology, Henan Polytechnic University, China.

His research interests include computer vision, pattern recognition, and image processing.

Chao Guo received the B. Sc. degree from Henan Polytechnic University, China in 2013. Currently, he is a master's student at the School of Computer Science and Technology, Henan Polytechnic University, China.

His research interests include image processing.

Hong-Min Liu received the B. Sc. degree in electrical & information engineering from Xidian University, China in 2004, and the Ph.D. degree from the Institute of Electronics, Chinese Academy of Sciences, China in 2009. Currently, she is an associate professor at the School of Computer Science and Technology, Henan Polytechnic University, China.

Her research interests include image processing, especially feature detection and matching.

Zhan-Qiang Huo received the B. Sc. degree in mathematics and applied mathematics from Hebei Normal University of Science & Technology, China in 2003. He received the M. Sc. degree in computer software and theory in 2006 and the Ph.D. degree in circuits and systems in 2009, both from Yanshan University, China. Currently, he is an associate professor in the College of Computer Science and Technology at Henan Polytechnic University, China. He has published about 20 refereed journal and conference papers.

His research interests include computer software and theory, queueing systems, and digital image processing.


Cite this article

Wang, ZH., Guo, C., Liu, HM. et al. MFSR: Maximum feature score region-based captions locating in news video images. Int. J. Autom. Comput. 15, 454–461 (2018). https://doi.org/10.1007/s11633-015-0943-5
