{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T18:05:22Z","timestamp":1732039522027},"reference-count":40,"publisher":"MDPI AG","issue":"16","license":[{"start":{"date-parts":[[2020,8,15]],"date-time":"2020-08-15T00:00:00Z","timestamp":1597449600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"This work compares Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO) deep neural networks for the outdoor advertisement panel detection problem by handling multiple and combined variabilities in the scenes. Publicity panel detection in images offers important advantages both in the real world as well as in the virtual one. For example, applications like Google Street View can be used for Internet publicity and when detecting these ads panels in images, it could be possible to replace the publicity appearing inside the panels by another from a funding company. In our experiments, both SSD and YOLO detectors have produced acceptable results under variable sizes of panels, illumination conditions, viewing perspectives, partial occlusion of panels, complex background and multiple panels in scenes. Due to the difficulty of finding annotated images for the considered problem, we created our own dataset for conducting the experiments. The major strength of the SSD model was the almost elimination of False Positive (FP) cases, situation that is preferable when the publicity contained inside the panel is analyzed after detecting them. On the other side, YOLO produced better panel localization results detecting a higher number of True Positive (TP) panels with a higher accuracy. 
Finally, a comparison of the two analyzed object detection models with different types of semantic segmentation networks and using the same evaluation metrics is also included.","DOI":"10.3390\/s20164587","type":"journal-article","created":{"date-parts":[[2020,8,17]],"date-time":"2020-08-17T08:35:51Z","timestamp":1597653351000},"page":"4587","source":"Crossref","is-referenced-by-count":53,"title":["SSD vs. YOLO for Detection of Outdoor Urban Advertising Panels under Multiple Variabilities"],"prefix":"10.3390","volume":"20","author":[{"given":"\u00c1ngel","family":"Morera","sequence":"first","affiliation":[{"name":"Technical School of Computer Science, Rey Juan Carlos University, 28933 M\u00f3stoles, Madrid, Spain"}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-9069-6985","authenticated-orcid":false,"given":"\u00c1ngel","family":"S\u00e1nchez","sequence":"additional","affiliation":[{"name":"Technical School of Computer Science, Rey Juan Carlos University, 28933 M\u00f3stoles, Madrid, Spain"}]},{"given":"A. Bel\u00e9n","family":"Moreno","sequence":"additional","affiliation":[{"name":"Technical School of Computer Science, Rey Juan Carlos University, 28933 M\u00f3stoles, Madrid, Spain"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-2468-0031","authenticated-orcid":false,"given":"\u00c1ngel D.","family":"Sappa","sequence":"additional","affiliation":[{"name":"Escuela Superior Polit\u00e9cnica del Litoral, ESPOL, Guayaquil 090101, Ecuador"},{"name":"Computer Vision Center, Bellaterra, 08193 Barcelona, Spain"}]},{"given":"Jos\u00e9 F.","family":"V\u00e9lez","sequence":"additional","affiliation":[{"name":"Technical School of Computer Science, Rey Juan Carlos University, 28933 M\u00f3stoles, Madrid, Spain"}]}],"member":"1968","published-online":{"date-parts":[[2020,8,15]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Anthopoulos, L. (2017). 
Understanding Smart Cities: A Tool for Smart Government or an Industrial Trick?, Springer.","DOI":"10.1007\/978-3-319-57015-0"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1016\/j.cities.2019.04.014","article-title":"Smart city and information technology: A review","volume":"93","author":"Camero","year":"2019","journal-title":"Cities"},{"key":"ref_3","unstructured":"Smartcity Press (2020, April 15). The Face of Digital Ads in Smart Cities. December 2018. Available online: https:\/\/www.smartcity.press\/smart-cities-digital-advertisements\/."},{"key":"ref_4","unstructured":"Borisova, O., and Martynova, A. (2017). Comparing the Effectiveness of Outdoor Advertising with Internet Advertising. [Bachelor\u2019s Thesis, JAMK University of Applied Sciences]."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Huang, Y., Hao, Q., and Yu, H. (2011, January 11\u201314). Virtual ads insertion in street building views for augmented reality. Proceedings of the 18th IEEE International Conference on Image Processing, Brussels, Belgium.","DOI":"10.1109\/ICIP.2011.6115623"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Wong, D., Deguchi, D., Ide, I., and Murase, H. (2014, January 6\u201312). Vision-based vehicle localization using a visual street map with embedded SURF scale. Proceedings of the European Conference on Computer Vision (ECCV \u201914), Zurich, Switzerland.","DOI":"10.1007\/978-3-319-16178-5_11"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Cao, J., Song, C., Peng, S., Xiao, F., and Song, S. (2019). Improved traffic sign detection and recognition algorithm for intelligent vehicles. 
Sensors, 19.","DOI":"10.3390\/s19184021"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"419","DOI":"10.1016\/j.procs.2016.03.054","article-title":"License plate detection using harris corner and character segmentation by integrated approach from an image","volume":"79","author":"Panchal","year":"2016","journal-title":"Procedia Comput. Sci."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Salamanca, S., Merch\u00e1n, P., and Garc\u00eda, I. (2017, January 3\u20136). On the detection of solar panels by image processing techniques. Proceedings of the 25th Mediterranean Conference on Control and Automation (MED\u201917), Valletta, Malta.","DOI":"10.1109\/MED.2017.7984163"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Intasuwan, T., Kaewthong, J., and Vittayakorn, S. (2018, January 7\u20139). Text and object detection on billboards. Proceedings of the International Conference on Information Technology and Electrical Engineering (ICITEE 2018), Kuta, Indonesia.","DOI":"10.1109\/ICITEED.2018.8534879"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"994","DOI":"10.1016\/j.patrec.2008.01.022","article-title":"Soccer video processing for the detection of advertisement billboards","volume":"29","author":"Watve","year":"2008","journal-title":"Pattern Recognit. Lett."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Hussain, Z., Zhang, M., Zhang, X., Ye, K., Thomas, C., Agha, Z., Ong, N., and Kovashka, A. (2017, January 21\u201326). Automatic understanding of image and video advertisements. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.123"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1007\/s10032-019-00320-5","article-title":"Scene text detection and recognition with advances in deep learning: A survey","volume":"22","author":"Liu","year":"2019","journal-title":"Int. J. Doc. Anal. 
Recognit."},{"key":"ref_14","unstructured":"ICDAR 2019 Conference (2020, July 22). ICDAR 2019 Robust Reading Challenge on Multi-Lingual Scene Text Detection and Recognition. Available online: https:\/\/rrc.cvc.uab.es\/?ch=15."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"970","DOI":"10.1109\/TPAMI.2013.182","article-title":"Robust text detection in natural scene images","volume":"36","author":"Yin","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1509","DOI":"10.1109\/TIP.2017.2656474","article-title":"Scene text detection and segmentation based on cascaded convolution neural networks","volume":"26","author":"Tang","year":"2017","journal-title":"IEEE Trans. Image Process."},{"key":"ref_17","unstructured":"Xie, E., Zang, Y., Shao, S., Yu, G., Yao, C., and Li, G. (February, January 27). Scene text detection with supervised pyramid context network. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, HI, USA."},{"key":"ref_18","unstructured":"Hossari, M., Dev, S., Nicholson, M., McCabe, K., Nautiyal, A., Conran, C., Tang, J., Xu, W., and Piti\u00e9, F. (2018, January 6\u20137). ADNet: A deep network for detecting adverts. Proceedings of the 26th AIAI Irish Conference on Artificial Intelligence and Cognitive Science (AICS \u201918), Dublin, Ireland."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Neuhold, G., Ollmann, T., Bull, S.R., and Kontschieder, P. (2017, January 22\u201329). The mapillary vistas dataset for semantic understanding of street scenes. Proceedings of the IEEE International Conference on Computer Vision (ICCV\u201917), Venice, Italy.","DOI":"10.1109\/ICCV.2017.534"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft COCO: Common objects in context. 
Proceedings of the European Conference on Computer Vision (ECCV\u201914), Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Dev, S., Hossari, M., Nicholson, M., McCabe, K., Nautiyal, A., Conran, C., Tang, J., Xu, W., and Piti\u00e9, F. (2019, January 27\u201331). The CASE dataset of candidate spaces for advert implantation. Proceedings of the International Conference on Machine Vision Applications (MVA \u201919), Tokyo, Japan.","DOI":"10.23919\/MVA.2019.8757977"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR \u201915), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid scene parsing network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the Medical Image Computing and Computer-Assisted Intervention (MICCAI \u201915), Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Skansi, S. (2018). 
Introduction to Deep Learning: From Logical Calculus to Artificial Intelligence, Springer Nature.","DOI":"10.1007\/978-3-319-73004-2"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_27","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). ImageNet classification with deep convolutional neural networks. Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS \u201912), Lake Tahoe, NV, USA."},{"key":"ref_28","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Xie, S., Girshick, R., Doll\u00e1r, P., Tu, Z., and He, K. (2017, January 21\u201326). Aggregated residual transformations for deep neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.634"},{"key":"ref_31","unstructured":"Zou, Z., Shi, Z., Guo, Y., and Ye, J. (2019). Object detection in 20 years: A survey. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Alganci, U., Soydas, M., and Sertel, E. (2020). Comparative research on deep learning approaches for airplane detection from very high-resolution satellite images. 
Remote Sens., 12.","DOI":"10.3390\/rs12030458"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C., and Berg, A.C. (2016, January 23\u201328). SSD: Single shot multibox detector. Proceedings of the European Conference on Computer Vision (ECCV \u201916), Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_34","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster R-CNN: Towards real-time object detection with region proposal networks. Proceedings of the International Conference on Neural Information Processing Systems (NIPS \u201915), Montreal, QC, Canada."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_36","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv."},{"key":"ref_37","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. (2016, January 27\u201330). The cityscapes dataset for semantic urban scene understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.350"},{"key":"ref_39","unstructured":"Dutta, A., Gupta, A., and Zisserman, A. (2020, January 30). VGG Image Annotator (VIA), Version: 1.0.6. 
Available online: http:\/\/www.robots.ox.ac.uk\/vgg\/software\/via."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1007\/s11263-019-01247-4","article-title":"Deep learning for generic object detection: A survey","volume":"128","author":"Liu","year":"2020","journal-title":"Int. J. Comput. Vis."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/16\/4587\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,11]],"date-time":"2024-08-11T22:39:06Z","timestamp":1723415946000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/16\/4587"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,15]]},"references-count":40,"journal-issue":{"issue":"16","published-online":{"date-parts":[[2020,8]]}},"alternative-id":["s20164587"],"URL":"https:\/\/doi.org\/10.3390\/s20164587","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2020,8,15]]}}}