Abstract
Generative adversarial networks (GANs) are widely used in person re-identification to expand training sets with generated auxiliary data. However, it is widely believed that using too much generated data during training reduces the accuracy of re-identification models. In this study, an improved generator and a constrained two-stage fusion network are proposed. A novel gesture discriminator embedded in the generator is used to measure the completeness of skeleton pose images. The improved generator makes the generated images more realistic, which benefits feature extraction. The constrained two-stage fusion network extracts and exploits the real information in the generated images for person re-identification. Unlike previous studies, this work also considers the fusion of shallow features. Specifically, the proposed network has two branches based on the ResNet50 structure: one fuses the images generated by the generative adversarial network, and the other fuses the result of the first fusion with the original image. Experimental results show that our method outperforms most existing similar methods on Market-1501 and DukeMTMC-reID.
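The two-stage data flow described above can be illustrated with a minimal sketch. The abstract does not specify the fusion operator or the form of the constraint, so everything here is an assumption made for illustration: plain Python lists stand in for shallow ResNet50 feature maps, stage one uses an element-wise mean, stage two uses a convex combination with a hypothetical weight `alpha`, and the function names are invented.

```python
from typing import List

Vector = List[float]


def fuse_mean(features: List[Vector]) -> Vector:
    """Stage 1 (assumed operator): element-wise mean of generated-image features."""
    if not features:
        raise ValueError("need at least one feature vector")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")
    return [sum(f[i] for f in features) / len(features) for i in range(dim)]


def fuse_weighted(original: Vector, generated: Vector, alpha: float = 0.5) -> Vector:
    """Stage 2 (assumed operator): convex combination of the two feature vectors."""
    if len(original) != len(generated):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * o + (1.0 - alpha) * g for o, g in zip(original, generated)]


def two_stage_fusion(original: Vector, generated: List[Vector], alpha: float = 0.5) -> Vector:
    """Fuse the generated features first, then fuse that result with the original."""
    stage_one = fuse_mean(generated)                    # branch 1: generated images only
    return fuse_weighted(original, stage_one, alpha)    # branch 2: add the original image


if __name__ == "__main__":
    original_features = [4.0, 0.0]
    generated_features = [[0.0, 2.0], [2.0, 6.0]]
    print(two_stage_fusion(original_features, generated_features))
```

In a real model the two branches would operate on convolutional feature maps and the fusion would be learned; the sketch only shows the order of the two fusion steps.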
Cite this article
Zhang, T., Sun, X., Li, X. et al. Image generation and constrained two-stage feature fusion for person re-identification. Appl Intell 51, 7679–7689 (2021). https://doi.org/10.1007/s10489-021-02271-z
DOI: https://doi.org/10.1007/s10489-021-02271-z