
MFSR: Maximum feature score region-based captions locating in news video images

  • Research Article
  • Published in: International Journal of Automation and Computing

Abstract

For news video images, caption recognition is a useful and important step toward content understanding, and caption locating is usually its first stage. This paper proposes a simple but effective caption locating algorithm, the maximum feature score region (MFSR) based method, which consists of two stages: in the first stage, the up/down boundaries are obtained from the projection of the edge map; in the second stage, the maximum feature score region is defined and the left/right boundaries are determined by utilizing the MFSR. Experiments show that the proposed MFSR-based method achieves superior and robust performance on news video images of different types.
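The abstract only outlines the two-stage pipeline. As a rough, non-authoritative sketch of how such a locator can be organized (the Canny edge map, the fixed thresholds, and the use of column-wise edge density as the "feature score" are illustrative assumptions, not the paper's definitions), a minimal Python/OpenCV version might look like this:

```python
# A minimal sketch of a two-stage caption locator in the spirit of the
# abstract, NOT the authors' implementation: the Canny edge map, the fixed
# thresholds, and column-wise edge density standing in for the "feature
# score" are all illustrative assumptions.
import numpy as np
import cv2  # OpenCV is assumed to be available


def locate_caption(gray, row_thresh=0.15, col_thresh=0.10):
    """Return (top, bottom, left, right) of a caption-like region, or None."""
    # Stage 1: project the edge map onto the vertical axis; rows with a high
    # edge density delimit the up/down boundaries of the caption band.
    edges = cv2.Canny(gray, 100, 200)  # binary edge map (0/255)
    row_profile = edges.sum(axis=1) / float(edges.shape[1] * 255)
    rows = np.flatnonzero(row_profile > row_thresh * row_profile.max())
    if rows.size == 0:
        return None
    top, bottom = int(rows.min()), int(rows.max())

    # Stage 2: inside the band, score each column (here simply its edge
    # density) and take the span of high-score columns as the left/right
    # boundaries.
    band = edges[top:bottom + 1]
    col_profile = band.sum(axis=0) / float(band.shape[0] * 255)
    cols = np.flatnonzero(col_profile > col_thresh * col_profile.max())
    if cols.size == 0:
        return None
    return top, bottom, int(cols.min()), int(cols.max())


# Hypothetical usage on a grayscale news frame "frame.png":
# box = locate_caption(cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE))
```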

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Nos. 61272394, 61201395 and 61472119), the Program for Science & Technology Innovation Talents in Universities of Henan Province (No. 13HASTIT039), the Henan Polytechnic University Innovative Research Team (No. T2014-3), and the Henan Polytechnic University Fund for Distinguished Young Scholars (No. J2013-2).

Author information

Corresponding author

Correspondence to Hong-Min Liu.

Additional information

Recommended by Associate Editor Victor Becerra

Zhi-Heng Wang received the B. Sc. degree in mechatronic engineering from Beijing Institute of Technology, China in 2004, and the Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences, China in 2009. Currently, he is an associate professor at the School of Computer Science and Technology, Henan Polytechnic University, China.

His research interests include computer vision, pattern recognition, and image processing.

Chao Guo received the B. Sc. degree from Henan Polytechnic University, China in 2013. Currently, he is a master's student at the School of Computer Science and Technology, Henan Polytechnic University, China.

His research interests include image processing.

Hong-Min Liu received the B. Sc. degree in electrical & information engineering from Xidian University, China in 2004, and the Ph.D. degree from the Institute of Electronics, Chinese Academy of Sciences, China in 2009. Currently, she is an associate professor at the School of Computer Science and Technology, Henan Polytechnic University, China.

Her research interests include image processing, especially feature detection and matching.

Zhan-Qiang Huo received the B. Sc. degree in mathematics and applied mathematics from Hebei Normal University of Science & Technology, China in 2003. He received the M. Sc. degree in computer software and theory in 2006 and the Ph.D. degree in circuits and systems in 2009, both from Yanshan University, China. Currently, he is an associate professor in the College of Computer Science and Technology at Henan Polytechnic University, China. He has published about 20 refereed journal and conference papers.

His research interests include computer software and theory, queueing systems, and digital image processing.


Cite this article

Wang, ZH., Guo, C., Liu, HM. et al. MFSR: Maximum feature score region-based captions locating in news video images. Int. J. Autom. Comput. 15, 454–461 (2018). https://doi.org/10.1007/s11633-015-0943-5
