Deep transformation learning for face recognition in the unconstrained scene

  • Original Paper
Machine Vision and Applications

Abstract

Because human pose variations cannot be controlled in unconstrained scenes, it is often hard to capture a frontal face image. As a result, the face recognition rate is low, or the face cannot be recognized at all. To tackle this problem, this paper proposes deep transformation learning, which extracts pose-robust features within a single model; it comprises a feature transformation and joint supervision by a softmax loss and a pose loss. Specifically, the feature transformation is designed to learn the transformation from different poses. The pose loss is designed to simultaneously learn the feature centers of different poses and preserve intra-pose relationships. The extracted deep features therefore tend to be more pose-robust and discriminative. Experimental results confirm the effectiveness of the method on several important face recognition benchmarks, including Labeled Faces in the Wild and IARPA Janus Benchmark A.
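
The abstract does not give the exact form of the pose loss, so the following is only a minimal sketch of what joint supervision by a softmax loss and a center-style pose loss could look like. It assumes that each training sample carries a discrete pose-bin label, that one feature center is kept per pose bin, and that the two terms are combined with a weight `lam`. The function names, the squared-distance form, and the weight are all illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy; logits has shape (N, C), labels has shape (N,)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # subtract row max for stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def pose_center_loss(features, pose_ids, centers):
    """Assumed pose term: half the mean squared distance to each sample's pose-bin center."""
    diffs = features - centers[pose_ids]  # (N, D) offsets from the assigned centers
    return 0.5 * (diffs ** 2).sum(axis=1).mean()

def joint_loss(logits, labels, features, pose_ids, centers, lam=0.01):
    """Identity loss plus a weighted pose term; lam is a hypothetical balancing weight."""
    return softmax_cross_entropy(logits, labels) + lam * pose_center_loss(features, pose_ids, centers)
```

In this sketch the softmax term keeps features discriminative across identities, while the pose term pulls features of the same pose bin toward a shared center. How the centers are updated during training is left out, since the abstract does not specify it.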


Acknowledgements

The work reported in this paper was supported by a research grant from the Chongqing Science & Technology Commission (Project Code: cstc2016shmszx0500), a research grant from the Scientific and Technological Research Program of Chongqing Municipal Education Commission (Project Code: KJ1729405), a research grant from the Foshan Economic and Information Bureau, and a research grant from the National Natural Science Foundation of China (Project Codes: 61675036, 61771080).

Author information

Corresponding author

Correspondence to Chaowei Tang.

Cite this article

Chen, G., Shao, Y., Tang, C. et al. Deep transformation learning for face recognition in the unconstrained scene. Machine Vision and Applications 29, 513–523 (2018). https://doi.org/10.1007/s00138-018-0907-1

  • DOI: https://doi.org/10.1007/s00138-018-0907-1
