Multi-resolution feature perception network for UAV person re-identification

Multimedia Tools and Applications

Abstract

Person re-identification (re-id) with unmanned aerial vehicles (UAVs) is of great significance in intelligent surveillance. However, recognizing a person of interest from UAVs is more challenging than existing person re-id tasks across multiple fixed cameras. Images taken by UAVs have large resolution variations and complex backgrounds because of the rapid movement and constantly changing flight altitudes of UAVs. Some methods propose cross-resolution learning for person images captured by fixed cameras, assuming low-resolution (LR) query images and high-resolution (HR) gallery images. However, they are incapable of handling the resolution variations in UAV scenarios, where both query and gallery images exhibit significant resolution variations. In this paper, we present a novel multi-resolution feature perception network (MRFPN) to learn discriminative and resolution-robust features for UAV person re-id. First, we introduce a self-attention module to capture full-image context information at the pixel level and obtain a pixel context-aware feature map for both HR and LR images, which can effectively deal with background clutter. Second, we construct a cross-attention module to learn resolution-robust representations by bi-directionally perceiving the resolution-guided semantic information between HR and LR features. Further, we design a semantic consistency constraint to limit the difference between HR and LR features. Extensive experiments show the superiority of our method on both UAV and traditional datasets.
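
The three components named in the abstract can be illustrated with a minimal sketch. It is not the MRFPN implementation: the abstract does not specify projections, attention heads, or the form of the consistency constraint, so the code below assumes plain scaled dot-product attention without learned weights and assumes a mean-squared difference between mean-pooled features as the consistency term. The names `attend` and `consistency_penalty` and the toy token values are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Generic scaled dot-product attention over lists of token vectors.

    Assumption: no learned query/key/value projections, single head.
    """
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out


def consistency_penalty(hr_tokens, lr_tokens):
    """Assumed form of a consistency term: mean squared difference
    between the mean-pooled HR and LR token features."""
    dim = len(hr_tokens[0])
    hr_mean = [sum(t[d] for t in hr_tokens) / len(hr_tokens) for d in range(dim)]
    lr_mean = [sum(t[d] for t in lr_tokens) / len(lr_tokens) for d in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(hr_mean, lr_mean)) / dim


# Toy "pixel" tokens: the HR map has more positions than the LR map.
hr = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
lr = [[0.5, 0.5], [1.0, 0.0]]

# Self-attention: each map attends to its own positions (full-image context).
hr_ctx = attend(hr, hr, hr)
lr_ctx = attend(lr, lr, lr)

# Bi-directional cross-attention: HR queries read LR features and vice versa.
hr_from_lr = attend(hr_ctx, lr_ctx, lr_ctx)
lr_from_hr = attend(lr_ctx, hr_ctx, hr_ctx)

# Consistency term that would be minimized alongside the re-id losses.
penalty = consistency_penalty(hr_from_lr, lr_from_hr)
print(len(hr_from_lr), len(lr_from_hr), penalty >= 0.0)
```

Mean pooling is used in the penalty only so that HR and LR maps with different numbers of positions can be compared; the paper may align the two feature sets differently.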

Data Availability Statements

All data included in this study are available from the corresponding author on reasonable request.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 62171318.

Author information

Corresponding author

Correspondence to Xuebo Zheng.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, M., Hou, C., Zheng, X. et al. Multi-resolution feature perception network for UAV person re-identification. Multimed Tools Appl 83, 62559–62580 (2024). https://doi.org/10.1007/s11042-023-17937-8

  • DOI: https://doi.org/10.1007/s11042-023-17937-8
