An improved Yolov5 real-time detection method for small objects captured by UAV

  • Mathematical methods in data science
Soft Computing

Abstract

Object detection algorithms are mainly designed for general scenarios; when the same algorithms are applied to drone-captured scenes, their detection performance drops significantly. Our research found that small objects are the main cause of this drop. To verify this finding, we choose the Yolov5 model and propose four methods based on it to improve the detection precision of small objects. Because a model used in practice must be small, fast, low-cost and easy to deploy, we also fully consider the impact of each of the four methods on detection speed when designing them. The model integrating all four methods greatly improves detection precision while limiting the loss of detection speed. On VisDrone-2020, the mAP of our model increases from 12.7% to 37.66%, and the detection speed reaches 55 FPS. The model outperforms the earlier state of the art in detection speed and advances object detection algorithms on drone platforms.
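
To put the reported figures in perspective, the short sketch below works out the absolute and relative mAP gain and the average per-frame time implied by the frame rate. It assumes both mAP values are percentages and that the frame rate is an average throughput; the function names `map_gain` and `frame_budget_ms` are illustrative and do not come from the paper.

```python
def map_gain(baseline_pct: float, improved_pct: float) -> tuple:
    """Return (absolute gain in percentage points, ratio to the baseline)."""
    absolute = improved_pct - baseline_pct
    ratio = improved_pct / baseline_pct
    return absolute, ratio


def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Figures quoted in the abstract: mAP 12.7 -> 37.66 (assumed percent), 55 FPS.
    absolute, ratio = map_gain(12.7, 37.66)
    print(f"mAP gain: {absolute:.2f} points ({ratio:.2f}x the baseline)")
    print(f"Per-frame budget at 55 FPS: {frame_budget_ms(55):.1f} ms")
```

Under these assumptions the gain is about 25 percentage points, roughly three times the baseline, and 55 FPS corresponds to about 18 ms per frame on average.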

Acknowledgements

The authors thank Professor Wei Zhan for providing the Artificial Intelligence Laboratory and for guidance on writing the paper.

Funding

Funding was provided by the China Postdoctoral Science Foundation (Grant No. 2019TQ0291), the Aeronautical Science Fund (Grant No. 2018ZCZ2002), the Natural Science Foundation of Hubei Province (CN) (Grant No. 2019CFB376), the second batch of the Chinese University industry research innovation foundation “new generation information technology innovation project” (Grant No. 2019ITA03004), and the Jingzhou Science and Technology Development Plan Project (Grant No. 2018024).

Author information

Contributions

CS and WZ carried out the conceptualization; WZ and MW developed the methodology; CS developed the software; CS, YS and YZ were involved in validation; JS performed the formal analysis; CS and WZ handled the writing, review and editing. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Wei Zhan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhan, W., Sun, C., Wang, M. et al. An improved Yolov5 real-time detection method for small objects captured by UAV. Soft Comput 26, 361–373 (2022). https://doi.org/10.1007/s00500-021-06407-8

  • DOI: https://doi.org/10.1007/s00500-021-06407-8