
A detection method for low-pixel ratio object

Published in: Multimedia Tools and Applications

Abstract

Low-pixel-ratio object detection is a challenging problem. Existing object detection benchmarks and methods mainly focus on the standard detection task, and they perform poorly on low-pixel-ratio objects, which occupy only a few pixels in high-resolution images. To address this problem, we propose a new deep learning framework that improves Faster R-CNN by combining feature maps from multiple levels and optimizing anchor sizes for bounding-box recognition. To validate our approach, we collect and annotate a dataset for road garbage detection containing 801 images and 966 bounding boxes. Experiments demonstrate that our framework outperforms other state-of-the-art detection methods. Moreover, our method can be applied to road garbage targets in practice.
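The two ideas named in the abstract, fusing multi-level feature maps and re-tuning anchor sizes so the region proposal network can match objects that cover only a few pixels, can be illustrated with a short sketch. The following is a minimal sketch using PyTorch's torchvision, not the paper's released code: the anchor sizes and the two-class setup (background plus garbage) are illustrative assumptions, since the abstract does not give the exact values. torchvision's FPN-based Faster R-CNN already fuses feature maps from multiple backbone stages, so the sketch only swaps in smaller anchors.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.rpn import AnchorGenerator

# Faster R-CNN with a ResNet-50 FPN backbone: the FPN already combines
# feature maps from multiple ResNet stages, which is the "multi-level
# feature map" side of the idea.
model = fasterrcnn_resnet50_fpn(num_classes=2)  # assumed: background + garbage

# Default anchors start at 32 px per side; shift every FPN level toward
# smaller scales so the RPN can match objects covering only a few pixels.
# One size tuple per FPN level (5 levels); the exact values are illustrative.
small_sizes = ((8,), (16,), (32,), (64,), (128,))
ratios = ((0.5, 1.0, 2.0),) * len(small_sizes)
model.rpn.anchor_generator = AnchorGenerator(small_sizes, ratios)

# Quick smoke test on one high-resolution image (C, H, W) scaled to [0, 1].
model.eval()
with torch.no_grad():
    detections = model([torch.rand(3, 1080, 1920)])
print(detections[0]["boxes"].shape)  # (num_detections, 4)
```

Keeping three aspect ratios per location preserves the shape of the RPN head, so only the anchor scales change; a sketch of the paper's own anchor and fusion choices would need the values reported in the full text.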



Acknowledgements

This work was supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61671423 and 61271403.

Author information

Corresponding author

Correspondence to Dong Yin.



Cite this article

Zhang, R., Yin, D., Ding, J. et al. A detection method for low-pixel ratio object. Multimed Tools Appl 78, 11655–11674 (2019). https://doi.org/10.1007/s11042-018-6653-6
