Abstract
A large portion of the global population generates multimedia data such as text, images, and videos. Visual content is one of the categories that most strongly influences the public at large. Through social media platforms (e.g., WhatsApp, Twitter, Facebook, Instagram, and YouTube), this material circulates without censorship and across national boundaries. Multimedia data containing violent or vulgar objects could trigger public unrest and therefore poses a serious threat to law and order. Children and teenagers use social media far more than previous generations did and create large amounts of multimedia data. It is important to assess the quality of multimedia content without bias or prejudice. Although the mainstream social media platforms apply various filters and rely on moderation by human experts, it is impossible to manually verify the terabytes of images and videos that are uploaded. It is therefore necessary to automate the content assessment phase without increasing upload time. This study aims to prevent the upload of, or to tag, an image or video that contains a gun as a reasonable percentage of its content. In this paper, the object detection architectures Faster RCNN, EfficientDet, and YOLOv5 are used to demonstrate how efficiently these techniques can detect human faces and different types of guns in given multimedia data (images/videos). The models are tested on various test images and video clips. A comparative analysis based on the mean average precision (mAP) and frames per second (FPS) metrics is also presented. YOLOv5 gives the best results, reaching 80.39% and 35.22% at \(\text{mAP}_{0.5}\) and \(\text{mAP}_{[0.50:0.95]}\), respectively. Face recognition usually requires thousands of samples, because the usual deep learning models are data-driven. In contrast, a few-shot learning approach has been implemented here to recognize the detected faces and to categorize the content as real or reel.
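The two reported metrics differ only in how strictly a predicted box must overlap a ground-truth box. The following minimal Python sketch illustrates that difference under stated assumptions: boxes are given as `(x1, y1, x2, y2)` corner coordinates, and the helper names `iou`, `iou_thresholds`, and `fraction_of_thresholds_met` are hypothetical, not taken from the paper. It does not compute mAP itself, which also requires a precision-recall curve over confidence-ranked detections for each class.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_thresholds():
    """The ten thresholds 0.50, 0.55, ..., 0.95 averaged over in mAP[0.50:0.95]."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def fraction_of_thresholds_met(pred_box, truth_box):
    """Share of the ten thresholds at which this prediction counts as a match."""
    score = iou(pred_box, truth_box)
    thresholds = iou_thresholds()
    return sum(score >= t for t in thresholds) / len(thresholds)


if __name__ == "__main__":
    truth = (0, 0, 100, 100)
    pred = (0, 0, 100, 72)  # illustrative boxes, not data from the paper
    print(iou(pred, truth))
    print(fraction_of_thresholds_met(pred, truth))
```

A prediction with an IoU between 0.5 and 0.75 counts as correct at \(\text{mAP}_{0.5}\) but fails the stricter thresholds, which is one reason a detector's \(\text{mAP}_{[0.50:0.95]}\) is typically well below its \(\text{mAP}_{0.5}\).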
Notes
YOLO: Real-Time Object Detection: https://pjreddie.com/darknet/yolo/.
EfficientDet: https://github.com/xuannianz/EfficientDet.
YOLOv5s: https://github.com/ultralytics/yolov5.
Collateral (2004): https://www.youtube.com/watch?v=EMS4lYA-hEo.
Acknowledgements
The work of Dr. Muhammad Khurram Khan is supported by Researchers Supporting Project number (RSP-2021/12), King Saud University, Riyadh, Saudi Arabia.
About this article
Cite this article
Chatterjee, R., Chatterjee, A., Islam, S. et al. An object detection-based few-shot learning approach for multimedia quality assessment. Multimedia Systems 29, 2899–2912 (2023). https://doi.org/10.1007/s00530-021-00881-8
DOI: https://doi.org/10.1007/s00530-021-00881-8