{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,23]],"date-time":"2024-09-23T04:10:48Z","timestamp":1727064648157},"reference-count":57,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2019,6,7]],"date-time":"2019-06-07T00:00:00Z","timestamp":1559865600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012639","name":"Prince Sultan University","doi-asserted-by":"publisher","award":["RIOTU Lab"],"id":[{"id":"10.13039\/501100012639","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"Segmenting aerial images is of great potential in surveillance and scene understanding of urban areas. It provides a mean for automatic reporting of the different events that happen in inhabited areas. This remarkably promotes public safety and traffic management applications. After the wide adoption of convolutional neural networks methods, the accuracy of semantic segmentation algorithms could easily surpass 80% if a robust dataset is provided. Despite this success, the deployment of a pretrained segmentation model to survey a new city that is not included in the training set significantly decreases accuracy. This is due to the domain shift between the source dataset on which the model is trained and the new target domain of the new city images. In this paper, we address this issue and consider the challenge of domain adaptation in semantic segmentation of aerial images. We designed an algorithm that reduces the domain shift impact using generative adversarial networks (GANs). 
In the experiments, we tested the proposed methodology on the International Society for Photogrammetry and Remote Sensing (ISPRS) semantic segmentation dataset and found that our method improves overall accuracy from 35% to 52% when passing from the Potsdam domain (considered as source domain) to the Vaihingen domain (considered as target domain). In addition, the method allows efficiently recovering the inverted classes due to sensor variation. In particular, it improves the average segmentation accuracy of the inverted classes due to sensor variation from 14% to 61%.","DOI":"10.3390\/rs11111369","type":"journal-article","created":{"date-parts":[[2019,6,7]],"date-time":"2019-06-07T09:25:56Z","timestamp":1559899556000},"page":"1369","source":"Crossref","is-referenced-by-count":169,"title":["Unsupervised Domain Adaptation Using Generative Adversarial Networks for Semantic Segmentation of Aerial Images"],"prefix":"10.3390","volume":"11","author":[{"given":"Bilel","family":"Benjdira","sequence":"first","affiliation":[{"name":"Robotics and internet of things Laboratory, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia"},{"name":"Research Laboratory Smart Electricity & ICT, SEICT, LR18ES44, National Engineering School of Carthage, University of Carthage, Carthage 1054, Tunisia"}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-9287-0596","authenticated-orcid":false,"given":"Yakoub","family":"Bazi","sequence":"additional","affiliation":[{"name":"Computer Engineering Department, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia"}]},{"given":"Anis","family":"Koubaa","sequence":"additional","affiliation":[{"name":"Prince Sultan University, Saudi Arabia\/Gaitech Robotics, China\/CISTER, INESC-TEC, ISEP, Polytechnic Institute of Porto, 4200-465 Porto, Portugal"}]},{"given":"Kais","family":"Ouni","sequence":"additional","affiliation":[{"name":"Research Laboratory Smart Electricity 
& ICT, SEICT, LR18ES44, National Engineering School of Carthage, University of Carthage, Carthage 1054, Tunisia"}]}],"member":"1968","published-online":{"date-parts":[[2019,6,7]]},"reference":[{"key":"ref_1","first-page":"1097","article-title":"Imagenet classification with deep convolutional neural networks","volume":"25","author":"Krizhevsky","year":"2012","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE conference on computer vision and pattern recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 22\u201325). Pyramid Scene Parsing Network. 
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1016\/j.neucom.2019.02.003","article-title":"Survey on semantic segmentation using deep learning techniques","volume":"338","author":"Lateef","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018). Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. Lecture Notes in Computer Science, Springer International Publishing.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_9","unstructured":"Gerke, M. (2014). Use of the Stair Vision Library within the ISPRS 2D Semantic Labeling Benchmark (Vaihingen), ResearchGate."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. (2016, January 27\u201330). The cityscapes dataset for semantic urban scene understanding. Proceedings of the IEEE conference on computer vision and pattern recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.350"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Richter, S.R., Vineet, V., Roth, S., and Koltun, V. (2016). Playing for data: Ground truth from computer games. 
European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46475-6_7"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Ros, G., Sellart, L., Materzynska, J., Vazquez, D., and Lopez, A.M. (2016, January 27\u201330). The synthia dataset: A large collection of synthetic images for semantic segmentation of urban scenes. Proceedings of the IEEE conference on computer vision and pattern recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.352"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Saleh, F., Aliakbarian, M.S., Salzmann, M., Petersson, L., Gould, S., and Alvarez, J.M. (2016). Built-in foreground\/background prior for weakly-supervised semantic segmentation. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46484-8_25"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Bearman, A., Russakovsky, O., Ferrari, V., and Fei-Fei, L. (2016). What\u2019s the point: Semantic segmentation with point supervision. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46478-7_34"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Shimoda, W., and Yanai, K. (2016). Distinct class-specific saliency maps for weakly supervised semantic segmentation. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46493-0_14"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Tzeng, E., Hoffman, J., Darrell, T., and Saenko, K. (2015, January 7\u201313). Simultaneous deep transfer across domains and tasks. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.463"},{"key":"ref_17","unstructured":"Long, M., Cao, Y., Wang, J., and Jordan, M.I. (arXiv, 2015). Learning transferable features with deep adaptation networks, arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Tzeng, E., Hoffman, J., Saenko, K., and Darrell, T. (2017, January 21\u201326). 
Adversarial discriminative domain adaptation. Proceedings of the Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.316"},{"key":"ref_19","unstructured":"Luo, Z., Zou, Y., Hoffman, J., and Fei-Fei, L.F. (2017, January 4\u20139). Label efficient learning of transferable representations acrosss domains and tasks. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_20","unstructured":"Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014, January 8\u201313). Generative adversarial nets. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zhu, J.Y., Park, T., Isola, P., and Efros, A.A. (2017, January 22\u201329). Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.244"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1109\/MSP.2014.2347059","article-title":"Visual Domain Adaptation: A survey of recent advances","volume":"32","author":"Patel","year":"2015","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Saenko, K., Kulis, B., Fritz, M., and Darrell, T. (2010). Adapting visual category models to new domains. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-642-15561-1_16"},{"key":"ref_24","first-page":"1","article-title":"Domain-adversarial training of neural networks","volume":"17","author":"Ganin","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_25","unstructured":"Ganin, Y., and Lempitsky, V. (arXiv, 2014). 
Unsupervised domain adaptation by backpropagation, arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Bousmalis, K., Silberman, N., Dohan, D., Erhan, D., and Krishnan, D. (2017, January 21\u201326). Unsupervised pixel-level domain adaptation with generative adversarial networks. Proceedings of the IEEE conference on computer vision and pattern recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.18"},{"key":"ref_27","unstructured":"Hoffman, J., Tzeng, E., Park, T., Zhu, J.Y., Isola, P., Saenko, K., Efros, A.A., and Darrell, T. (arXiv, 2017). Cycada: Cycle-consistent adversarial domain adaptation, arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"797","DOI":"10.1109\/TPAMI.2013.163","article-title":"Virtual and real world adaptation for pedestrian detection","volume":"36","author":"Vazquez","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Peng, X., and Saenko, K. (2018, January 12\u201315). Synthetic to real adaptation with generative correlation alignment networks. Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA.","DOI":"10.1109\/WACV.2018.00219"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Shrivastava, A., Pfister, T., Tuzel, O., Susskind, J., Wang, W., and Webb, R. (2017, January 21\u201326). Learning from simulated and unsupervised images through adversarial training. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.241"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Shafaei, A., Little, J.J., and Schmidt, M. (arXiv, 2016). Play and learn: Using video games to train computer vision models, arXiv.","DOI":"10.5244\/C.30.26"},{"key":"ref_32","unstructured":"Hoffman, J., Wang, D., Yu, F., and Darrell, T. (arXiv, 2016). 
Fcns in the wild: Pixel-level adversarial and constraint-based adaptation, arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhang, Y., David, P., and Gong, B. (2017, January 22\u201329). Curriculum domain adaptation for semantic segmentation of urban scenes. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.223"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Chen, Y., Li, W., and Van Gool, L. (2018, January 18\u201323). Road: Reality oriented adaptation for semantic segmentation of urban scenes. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00823"},{"key":"ref_35","unstructured":"Sankaranarayanan, S., Balaji, Y., Jain, A., Lim, S.N., and Chellappa, R. (arXiv, 2017). Unsupervised domain adaptation for semantic segmentation with gans, arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Tsai, Y.H., Hung, W.C., Schulter, S., Sohn, K., Yang, M.H., and Chandraker, M. (2018, January 18\u201323). Learning to adapt structured output space for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00780"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Huang, H., Huang, Q., and Krahenbuhl, P. (2018, January 8\u201314). Domain transfer through deep activation matching. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01270-0_36"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Qiu, Z., Yao, T., Liu, D., and Mei, T. (2018, January 18\u201323). Fully Convolutional Adaptation Networks for Semantic Segmentation. 
Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00712"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Oliehoek, F.A., Savani, R., Gallego, J., van der Pol, E., and Gro\u00df, R. (arXiv, 2018). Beyond Local Nash Equilibria for Adversarial Networks, arXiv.","DOI":"10.1007\/978-3-030-31978-6_7"},{"key":"ref_40","unstructured":"Goodfellow, I.J. (arXiv, 2016). NIPS 2016 Tutorial: Generative Adversarial Networks, arXiv."},{"key":"ref_41","unstructured":"Liu, M.Y., Breuel, T., and Kautz, J. (2017, January 4\u20139). Unsupervised Image-to-Image Translation Networks. Proceedings of the NIPS Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_42","unstructured":"Zhu, J.Y., Zhang, R., Pathak, D., Darrell, T., Efros, A.A., Wang, O., and Shechtman, E. (2017, January 4\u20139). Toward Multimodal Image-to-Image Translation. Proceedings of the NIPS Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Yi, Z., Zhang, H., Tan, P., and Gong, M. (2017, January 22\u201329). DualGAN: Unsupervised Dual Learning for Image-to-Image Translation. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.310"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Isola, P., Zhu, J.Y., Zhou, T., and Efros, A.A. (2017, January 21\u201326). Image-to-Image Translation with Conditional Adversarial Networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.632"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Bashmal, L., Bazi, Y., AlHichri, H., AlRahhal, M.M., Ammour, N., and Alajlan, N. (2018). Siamese-GAN: Learning Invariant Representations for Aerial Vehicle Image Categorization. 
Remote Sens., 10.","DOI":"10.3390\/rs10020351"},{"key":"ref_46","unstructured":"Xu, B., Wang, N., Chen, T., and Li, M. (arXiv, 2015). Empirical Evaluation of Rectified Activations in Convolutional Network, arXiv."},{"key":"ref_47","first-page":"1929","article-title":"Dropout: A simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_48","unstructured":"Ulyanov, D., Vedaldi, A., and Lempitsky, V. (arXiv, 2016). Instance Normalization: The Missing Ingredient for Fast Stylization, arXiv."},{"key":"ref_49","unstructured":"Ioffe, S., and Szegedy, C. (arXiv, 2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, arXiv."},{"key":"ref_50","unstructured":"Xiang, S., and Li, H. (arXiv, 2017). On the effects of batch and weight normalization in generative adversarial networks, arXiv."},{"key":"ref_51","unstructured":"Yu, C., Wang, J., Peng, C., Gao, C., Yu, G., and Sang, N. (2018). BiSeNet: Bilateral Segmentation. Lecture Notes in Computer Science, Springer International Publishing."},{"key":"ref_52","unstructured":"(2019, March 28). Real-Time Semantic Segmentation on Cityscapes. Available online: https:\/\/paperswithcode.com\/sota\/real-time-semantic-segmentation-cityscap."},{"key":"ref_53","unstructured":"(2019, March 28). Real-Time Semantic Segmentation on Cityscapes. Available online: https:\/\/github.com\/GeorgeSeif\/Semantic-Segmentation-Suite."},{"key":"ref_54","unstructured":"Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., and Isard, M. (2016, January 2\u20134). TensorFlow: A system for large-scale machine learning. Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). 
Deep Residual Learning for Image Recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_56","unstructured":"Kingma, D.P., and Ba, J. (arXiv, 2014). Adam: A Method for Stochastic Optimization, arXiv."},{"key":"ref_57","unstructured":"Chollet, F. (2019, June 06). Keras. Available online: https:\/\/github.com\/fchollet\/keras."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/11\/1369\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,19]],"date-time":"2024-07-19T09:17:28Z","timestamp":1721380648000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/11\/1369"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,6,7]]},"references-count":57,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2019,6]]}},"alternative-id":["rs11111369"],"URL":"https:\/\/doi.org\/10.3390\/rs11111369","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,6,7]]}}}