Abstract
Matrix-NMS originates in SOLO, an instance segmentation network. In this work, Matrix-NMS is transplanted to the object detection network YOLO to improve detection accuracy and recall in practical scenarios where similar objects are close to each other or partially occlude one another. The model used in this article is based on YOLOv3. ResNet18 and ResNet50 were studied as pre-trained networks to compare accuracy and recall with and without Matrix-NMS. The results show that the model using Matrix-NMS improves accuracy and recall to some degree while keeping a similar inference time. The approach could be applied to face detection, to sign detection under complex road conditions for autonomous driving, or to lesion (focus) detection in medical imaging. In addition to Matrix-NMS, convolution layers with DropBlock and the pre-trained networks ResNet50 and ResNet18 with Deformable Convolutional Networks were also studied.
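To make the idea concrete, the following is a minimal NumPy sketch of Matrix-NMS score decay applied to bounding boxes rather than masks. It is an illustration under stated assumptions, not the authors' implementation: boxes are assumed to be single-class and in (x1, y1, x2, y2) corner format, and the names `pairwise_iou`, `matrix_nms`, and the default `sigma=2.0` are chosen here for the example. Instead of deleting overlapping boxes one at a time, every score is decayed in a single matrix pass according to its overlap with higher-scored boxes.

```python
import numpy as np

def pairwise_iou(boxes):
    """IoU between every pair of boxes given as (x1, y1, x2, y2)."""
    x1 = np.maximum(boxes[:, None, 0], boxes[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], boxes[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], boxes[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], boxes[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    union = area[:, None] + area[None, :] - inter
    return inter / np.maximum(union, 1e-9)

def matrix_nms(boxes, scores, kernel="gaussian", sigma=2.0):
    """Return (order, decayed_scores); decayed_scores follow `order`."""
    # Sort by descending score so that row i always outranks column j > i.
    order = np.argsort(-scores)
    boxes, scores = boxes[order], scores[order]

    # Keep only the upper triangle: iou[i, j] is the overlap of box j
    # with a higher-scored box i.
    iou = np.triu(pairwise_iou(boxes), k=1)

    # For each box i, its largest overlap with any higher-scored box.
    # This discounts the suppression exerted by boxes that are
    # themselves likely to be suppressed.
    compensate = iou.max(axis=0)[:, None]

    if kernel == "gaussian":
        decay = np.exp(-sigma * iou ** 2) / np.exp(-sigma * compensate ** 2)
    else:  # linear kernel; the guard avoids division by zero for duplicates
        decay = (1.0 - iou) / np.maximum(1.0 - compensate, 1e-9)

    # Each box takes the strongest decay over all higher-scored boxes.
    return order, scores * decay.min(axis=0)

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
order, decayed = matrix_nms(boxes, scores)
keep = order[decayed > 0.5]  # the threshold 0.5 is an arbitrary example value
```

In this toy input the second box overlaps the first heavily, so its score is decayed, while the isolated third box keeps its score. Final detections are obtained by thresholding the decayed scores, which is why the procedure needs no sequential loop.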
Supported by Baidu ADT.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, X., Peng, Y., Yang, J. (2021). Application of Matrix-NMS in Face Detection and Autonomous Driving. In: Liu, Z., Wu, F., Das, S.K. (eds) Wireless Algorithms, Systems, and Applications. WASA 2021. Lecture Notes in Computer Science, vol 12939. Springer, Cham. https://doi.org/10.1007/978-3-030-86137-7_17
DOI: https://doi.org/10.1007/978-3-030-86137-7_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86136-0
Online ISBN: 978-3-030-86137-7
eBook Packages: Computer Science, Computer Science (R0)