EfficientFace: an efficient deep network with feature enhancement for accurate face detection

Wang, Guangtao; Li, Jun; Wu, Zhijian; Xu, Jianhua; Shen, Jifeng; Yang, Wankou

doi:10.1007/s00530-023-01134-6

Regular Paper
Published: 14 July 2023

Volume 29, pages 2825–2839, (2023)

Multimedia Systems

Guangtao Wang¹,
Jun Li¹,
Zhijian Wu²,
Jianhua Xu¹,
Jifeng Shen³ &
…
Wankou Yang⁴

Abstract

In recent years, deep convolutional neural networks (CNNs) have significantly advanced face detection. In particular, lightweight CNN-based architectures have achieved great success because their low-complexity structure facilitates real-time detection. However, current lightweight CNN-based face detectors trade accuracy for efficiency and struggle with insufficient feature representation, faces with unbalanced aspect ratios, and occlusion. Consequently, their performance lags far behind that of deep, heavyweight detectors. To achieve efficient face detection without sacrificing accuracy, we design an efficient deep face detector, termed EfficientFace, which contains three modules for feature enhancement. First, we design a novel cross-scale feature fusion strategy that facilitates bottom-up information propagation, further strengthening the fusion of low-level and high-level features. This helps to estimate face locations and enhances the descriptive power of face features. Second, we introduce a Receptive Field Enhancement module to handle faces with various aspect ratios. Third, we add an Attention Mechanism module to improve the representational capability for occluded faces. We evaluate EfficientFace on four public benchmarks, and the experimental results demonstrate the appealing performance of our method. In particular, our model achieves 95.1% (Easy), 94.0% (Medium), and 90.1% (Hard) on the validation set of the WIDER Face dataset, which is competitive with heavyweight models at only 1/15 of the computational cost of the state-of-the-art MogFace detector.
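The abstract does not describe the internals of the Attention Mechanism module, so the following is only a minimal sketch of generic channel attention (a squeeze-and-excitation style gate), not the authors' design. The function name `channel_attention`, the per-channel `weights` and `biases`, and the toy feature map are all hypothetical and chosen for illustration.

```python
import math


def channel_attention(feature_map, weights, biases):
    """Rescale each channel of a [C][H][W] feature map by a gate in (0, 1).

    Hypothetical illustration: one scalar weight and bias per channel
    stand in for the small learned network a real module would use.
    """
    gated = []
    for channel, w, b in zip(feature_map, weights, biases):
        # Squeeze: global average pooling over all spatial positions.
        values = [v for row in channel for v in row]
        pooled = sum(values) / len(values)
        # Excite: a sigmoid gate computed from the pooled descriptor.
        gate = 1.0 / (1.0 + math.exp(-(w * pooled + b)))
        # Rescale: multiply every activation in the channel by its gate.
        gated.append([[v * gate for v in row] for row in channel])
    return gated


# Toy input: two channels of size 2x2 (assumed values, not from the paper).
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
out = channel_attention(features, weights=[0.0, 1.0], biases=[0.0, 0.0])
```

In this sketch, channels whose pooled response yields a larger gate are preserved more strongly, which is the general idea behind emphasizing informative features when parts of a face are occluded.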

Data availability

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

References

Viola, P., Jones, M.J.: Robust real-time face detection. Int. J. Comput. Vis. 57(2), 137–154 (2004)
Liu, Y., Wang, F., Sun, B., Li, H.: Mogface: Rethinking scale augmentation on the face detector. arXiv preprint arXiv:2103.11139 (2021). https://github.com/damo-cv/MogFace
Zhang, F., Fan, X., Ai, G., Song, J., Qin, Y., Wu, J.: Accurate face detection for high performance. arXiv preprint arXiv:1905.01585 (2019)
Li, J., Wang, Y., Wang, C., Tai, Y., Qian, J., Yang, J., Wang, C., Li, J., Huang, F.: Dsfd: dual shot face detector. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5060–5069 (2019)
Yoo, Y., Han, D., Yun, S.: Extd: Extremely tiny face detector via iterative filter reuse. arXiv preprint arXiv:1906.06579 (2019)
Qi, D., Tan, W., Yao, Q., Liu, J.: Yolo5face: Why reinventing a face detector. In: Computer Vision–ECCV 2022 Workshops: Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part V, pp. 228–244 (2023). Springer. https://github.com/deepcam-cn/yolov5-face
He, Y., Xu, D., Wu, L., Jian, M., Xiang, S., Pan, C.: Lffd: A light and fast face detector for edge devices. arXiv preprint arXiv:1904.10633 (2019)
Tan, M., Le, Q.: Efficientnet: Rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105–6114 (2019). PMLR
Girshick, R.: Fast r-cnn. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
Ren, S., He, K., Girshick, R., Sun, J.: Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Proc. Syst. 28 (2015)
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., Berg, A.C.: Ssd: Single shot multibox detector. In: European Conference on Computer Vision, pp. 21–37 (2016). Springer
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: Unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., Tian, Q.: Centernet: Keypoint triplets for object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6569–6578 (2019)
Vesdapunt, N., Wang, B.: Crface: Confidence ranker for model-agnostic face detection refinement. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1674–1684 (2021)
Zhang, C., Xu, X., Tu, D.: Face detection using improved faster rcnn. arXiv preprint arXiv:1802.02142 (2018)
Zhang, S., Zhu, R., Wang, X., Shi, H., Fu, T., Wang, S., Mei, T., Li, S.Z.: Improved selective refinement network for face detection. arXiv preprint arXiv:1901.06651 (2019)
Zhang, Y., Xu, X., Liu, X.: Robust and high performance face detector. arXiv preprint arXiv:1901.02350 (2019)
Zhu, Y., Cai, H., Zhang, S., Wang, C., Xiong, Y.: Tinaface: Strong but simple baseline for face detection. arXiv preprint arXiv:2011.13183 (2020)
Zhang, S., Zhu, X., Lei, Z., Shi, H., Wang, X., Li, S.Z.: S3fd: Single shot scale-invariant face detector. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 192–201 (2017)
Wang, J., Yuan, Y., Yu, G.: Face attention network: An effective face detector for the occluded faces. arXiv preprint arXiv:1711.07246 (2017)
Najibi, M., Samangouei, P., Chellappa, R., Davis, L.S.: Ssh: Single stage headless face detector. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4875–4884 (2017)
Tang, X., Du, D.K., He, Z., Liu, J.: Pyramidbox: A context-assisted single shot face detector. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 797–813 (2018)
Ming, X., Wei, F., Zhang, T., Chen, D., Wen, F.: Group sampling for scale invariant face detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3446–3456 (2019)
Liu, Y., Tang, X., Han, J., Liu, J., Rui, D., Wu, X.: Hambox: Delving into mining high-quality anchors on face detection. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13043–13051 (2020). IEEE
Zhang, B., Li, J., Wang, Y., Tai, Y., Wang, C., Li, J., Huang, F., Xia, Y., Pei, W., Ji, R.: Asfd: Automatic and scalable face detector. arXiv preprint arXiv:2003.11228 (2020)
Yolov5. https://github.com/ultralytics/yolov5 (2020)
Tan, M., Pang, R., Le, Q.V.: Efficientdet: Scalable and efficient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10781–10790 (2020)
Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017)
Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8759–8768 (2018)
Zhao, Q., Sheng, T., Wang, Y., Tang, Z., Chen, Y., Cai, L., Ling, H.: M2det: A single-shot object detector based on multi-level feature pyramid network. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 9259–9266 (2019)
Ghiasi, G., Lin, T.-Y., Le, Q.V.: Nas-fpn: Learning scalable feature pyramid architecture for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7029–7038 (2019)
Cao, J., Chen, Q., Guo, J., Shi, R.: Attention-guided context feature pyramid network for object detection. arXiv preprint arXiv:2005.11475 (2020)
Wang, J., Chen, Y., Gao, M., Dong, Z.: Improved yolov5 network for real-time multi-scale traffic sign detection. arXiv preprint arXiv:2112.08782 (2021)
Qiao, S., Chen, L.-C., Yuille, A.: Detectors: Detecting objects with recursive feature pyramid and switchable atrous convolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10213–10224 (2021)
Han, K., Wang, Y., Tian, Q., Guo, J., Xu, C., Xu, C.: Ghostnet: More features from cheap operations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1580–1589 (2020)
Yang, S., Luo, P., Loy, C.-C., Tang, X.: Wider face: A face detection benchmark. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5525–5533 (2016)
Chi, C., Zhang, S., Xing, J., Lei, Z., Li, S.Z., Zou, X.: Selective refinement network for high performance face detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 8231–8238 (2019). https://github.com/ChiCheng123/SRN
Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: Cbam: Convolutional block attention module. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19 (2018)
Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017)
Zhu, X., Ramanan, D.: Face detection, pose estimation, and landmark localization in the wild. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2879–2886 (2012). IEEE
Yan, J., Zhang, X., Lei, Z., Li, S.Z.: Face detection by structural models. Image Vis. Comput. 32(10), 790–799 (2014)
Jain, V., Learned-Miller, E.: Fddb: A benchmark for face detection in unconstrained settings. Technical report, UMass Amherst technical report (2010)
Zitnick, C.L., Dollár, P.: Edge boxes: Locating object proposals from edges. In: European Conference on Computer Vision, pp. 391–405 (2014). Springer
Zhang, S., Chi, C., Lei, Z., Li, S.Z.: Refineface: Refinement neural network for high performance face detection. IEEE Trans. Pattern Anal. Mach. Intell. 43(11), 4008–4020 (2020)
Najibi, M., Singh, B., Davis, L.S.: Fa-rpn: Floating region proposals for face detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7723–7732 (2019)
Zhang, S., Wen, L., Shi, H., Lei, Z., Lyu, S., Li, S.Z.: Single-shot scale-aware network for real-time face detection. Int. J. Comput. Vis. 127(6), 537–559 (2019)
Zhang, S., Zhu, X., Lei, Z., Shi, H., Wang, X., Li, S.Z.: Faceboxes: A cpu real-time face detector with high accuracy. In: 2017 IEEE International Joint Conference on Biometrics (IJCB), pp. 1–9 (2017). IEEE
Ranjan, R., Patel, V.M., Chellappa, R.: Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. IEEE Trans. Pattern Anal. Mach. Intell. 41(1), 121–135 (2017)
Chen, D., Hua, G., Wen, F., Sun, J.: Supervised transformer network for efficient face detection. In: European Conference on Computer Vision, pp. 122–138 (2016). Springer

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants No. 62173186, 62076134, and 62276061, and in part by the NSF of China under Grant No. 61903164 and the NSF of Jiangsu Province, China, under Grant BK20191427.

Author information

Authors and Affiliations

School of Computer and Electronic Information, Nanjing Normal University, Nanjing, 210023, China
Guangtao Wang, Jun Li & Jianhua Xu
School of Data Science and Engineering, East China Normal University, Shanghai, 200062, China
Zhijian Wu
School of Electrical and Information Engineering, Jiangsu University, Zhenjiang, 212013, China
Jifeng Shen
School of Automation, Southeast University, Nanjing, 210096, China
Wankou Yang

Corresponding author

Correspondence to Jun Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by R. Huang.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, G., Li, J., Wu, Z. et al. EfficientFace: an efficient deep network with feature enhancement for accurate face detection. Multimedia Systems 29, 2825–2839 (2023). https://doi.org/10.1007/s00530-023-01134-6

Received: 04 November 2022
Accepted: 04 July 2023
Published: 14 July 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s00530-023-01134-6
