Abstract
Detecting low-pixel-ratio objects is a difficult problem. Existing object detection benchmarks and methods mainly focus on the standard detection task. However, these methods perform poorly on low-pixel-ratio objects, which occupy only a few pixels in high-resolution images. To address this, we propose a new deep learning framework that improves Faster R-CNN by combining multi-level feature maps and optimizing anchor sizes for bounding box recognition. To validate our approach, we collect and annotate a dataset for road garbage detection containing 801 images and 966 bounding boxes. Experiments demonstrate that our framework outperforms other state-of-the-art detection methods. Moreover, our method can be applied to road garbage detection.
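
To illustrate why anchor size matters for objects that cover only a few pixels, the following is a minimal sketch and not the authors' implementation. It compares the default Faster R-CNN anchor scales (128, 256, 512 with aspect ratios 1:2, 1:1, 2:1) against a smaller set of scales. The small scales (16, 32, 64), the 20x20 object, the 1920x1080 image size, and all function names are assumptions chosen for illustration; the abstract does not state the values the authors use.

```python
import math


def make_anchors(scales, ratios):
    """Return (width, height) pairs, one per scale/ratio combination.

    Each anchor has area scale**2 and aspect ratio height/width = ratio.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((w, h))
    return anchors


def pixel_ratio(box_w, box_h, img_w, img_h):
    """Fraction of the image area covered by a bounding box."""
    return (box_w * box_h) / (img_w * img_h)


def best_iou_same_center(box_w, box_h, anchors):
    """Best IoU between a box and any anchor that shares its center.

    With a shared center, the intersection is the overlap of the two
    rectangles' widths times the overlap of their heights.
    """
    best = 0.0
    for w, h in anchors:
        inter = min(w, box_w) * min(h, box_h)
        union = w * h + box_w * box_h - inter
        best = max(best, inter / union)
    return best


RATIOS = (0.5, 1.0, 2.0)
DEFAULT_SCALES = (128, 256, 512)  # Faster R-CNN defaults
SMALL_SCALES = (16, 32, 64)       # hypothetical, for illustration only

default_anchors = make_anchors(DEFAULT_SCALES, RATIOS)
small_anchors = make_anchors(SMALL_SCALES, RATIOS)

# Hypothetical low-pixel-ratio object: 20x20 pixels in a 1920x1080 image.
ratio = pixel_ratio(20, 20, 1920, 1080)
iou_default = best_iou_same_center(20, 20, default_anchors)
iou_small = best_iou_same_center(20, 20, small_anchors)

print(f"pixel ratio:            {ratio:.6f}")
print(f"best IoU, default set:  {iou_default:.3f}")
print(f"best IoU, small scales: {iou_small:.3f}")
```

Under these assumptions, the best default anchor overlaps the 20x20 object with an IoU of roughly 0.02, far below the IoU thresholds normally used to label an anchor as positive during training, while the smaller scales reach 0.64. This is the general motivation for adapting anchor sizes to small targets; the specific sizes used in the paper would need to be taken from its method section.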
Acknowledgements
This work is supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61671423 and Grant No. 61271403.
Cite this article
Zhang, R., Yin, D., Ding, J. et al. A detection method for low-pixel ratio object. Multimed Tools Appl 78, 11655–11674 (2019). https://doi.org/10.1007/s11042-018-6653-6
DOI: https://doi.org/10.1007/s11042-018-6653-6