{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,1,16]],"date-time":"2025-01-16T05:27:47Z","timestamp":1737005267073,"version":"3.33.0"},"reference-count":42,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2023,3,31]],"date-time":"2023-03-31T00:00:00Z","timestamp":1680220800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Innovation Research Fund of Agricultural Information Institute of CAAS, China","award":["CAAS-ASTIP-2016-AII"]},{"name":"Central Public-interest Scientific Institution Basal Research Fund","award":["JBYW-AII-2022-02"]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"It is very significant for rural planning to accurately count the number and area of rural homesteads by means of automation. The development of deep learning makes it possible to achieve this goal. At present, many effective works have been conducted to extract building objects from VHR images using semantic segmentation technology, but they do not extract instance objects and do not work for densely distributed and overlapping rural homesteads. Most of the existing mainstream instance segmentation frameworks are based on the top-down structure. The model is complex and requires a large number of manually set thresholds. In order to solve the above difficult problems, we designed a simple query-based instance segmentation framework, QueryFormer, which includes an encoder and a decoder. A multi-scale deformable attention mechanism is incorporated into the encoder, resulting in significant computational savings, while also achieving effective results. In the decoder, we designed multiple groups, and used a Many-to-One label assignment method to make the image feature region be queried faster. 
Experiments show that our method achieves better performance (52.8AP) than the other most advanced models (+0.8AP) in the task of extracting rural homesteads in dense regions. This study shows that query-based instance segmentation framework has strong application potential in remote sensing images.","DOI":"10.3390\/s23073643","type":"journal-article","created":{"date-parts":[[2023,3,31]],"date-time":"2023-03-31T12:27:27Z","timestamp":1680265647000},"page":"3643","source":"Crossref","is-referenced-by-count":4,"title":["A Query-Based Network for Rural Homestead Extraction from VHR Remote Sensing Images"],"prefix":"10.3390","volume":"23","author":[{"given":"Ren","family":"Wei","sequence":"first","affiliation":[{"name":"Institute of Agricultural Information, Chinese Academy of Agricultural Sciences, Beijing 100876, China"},{"name":"Key Laboratory of Agricultural Blockchain Application, Ministry of Agriculture and Rural Affairs, Beijing 100125, China"}]},{"given":"Beilei","family":"Fan","sequence":"additional","affiliation":[{"name":"Institute of Agricultural Information, Chinese Academy of Agricultural Sciences, Beijing 100876, China"},{"name":"Key Laboratory of Agricultural Blockchain Application, Ministry of Agriculture and Rural Affairs, Beijing 100125, China"}]},{"given":"Yuting","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Agricultural Information, Chinese Academy of Agricultural Sciences, Beijing 100876, China"},{"name":"Key Laboratory of Agricultural Blockchain Application, Ministry of Agriculture and Rural Affairs, Beijing 100125, China"}]},{"given":"Rongchao","family":"Yang","sequence":"additional","affiliation":[{"name":"Institute of Agricultural Information, Chinese Academy of Agricultural Sciences, Beijing 100876, China"},{"name":"Key Laboratory of Agricultural Blockchain Application, Ministry of Agriculture and Rural Affairs, Beijing 100125, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"4775","DOI":"10.1109\/TGRS.2017.2700322","article-title":"Deep Feature Fusion for VHR Remote Sensing Scene Classification","volume":"55","author":"Chaib","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1016\/j.isprsjprs.2021.03.016","article-title":"A Global Context-aware and Batch-independent Network for road extraction from VHR satellite imagery","volume":"175","author":"Zhu","year":"2021","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"6524","DOI":"10.1109\/TGRS.2020.2977248","article-title":"Object-Oriented Key Point Vector Distance for Binary Land Cover Change Detection Using VHR Remote Sensing Images","volume":"58","author":"Lv","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Yi, Y., Zhang, Z., Zhang, W., Zhang, C., Li, W., and Zhao, T. (2019). Semantic Segmentation of Urban Buildings from VHR Remote Sensing Imagery Using a Deep Convolutional Neural Network. Remote Sens., 11.","DOI":"10.3390\/rs11151774"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.inffus.2016.03.003","article-title":"A review of remote sensing image fusion methods","volume":"32","author":"Ghassemian","year":"2016","journal-title":"Inf. Fusion"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1940","DOI":"10.1109\/TGRS.2003.814625","article-title":"Classification and feature extraction for remote sensing images from urban areas based on morphological transformations","volume":"41","author":"Benediktsson","year":"2003","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"3804","DOI":"10.1109\/TGRS.2008.922034","article-title":"Spectral and Spatial Classification of Hyperspectral Data Using SVMs and Morphological Profiles","volume":"46","author":"Fauvel","year":"2008","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1016\/j.isprsjprs.2015.03.011","article-title":"Semantic classification of urban buildings combining VHR image and GIS data: An improved random forest approach","volume":"105","author":"Du","year":"2015","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Yuan, Q., and Mohd Shafri, H.Z. (2022). Multi-Modal Feature Fusion Network with Adaptive Center Point Detector for Building Instance Extraction. Remote Sens., 14.","DOI":"10.3390\/rs14194920"},{"key":"ref_10","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012). Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst., 25."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","first-page":"640","article-title":"Fully Convolutional Networks for Semantic Segmentation","volume":"39","author":"Long","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Cai, Z., and Vasconcelos, N. 
(2018, January 18\u201323). Cascade r-cnn: Delving into high quality object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00644"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Chen, K., Pang, J., Wang, J., Xiong, Y., Li, X., Sun, S., Feng, W., Liu, Z., Shi, J., and Ouyang, W. (2019, January 15\u201320). Hybrid task cascade for instance segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00511"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Wu, Y., He, K., and Girshick, R. (2020, January 13\u201319). PointRend: Image Segmentation as Rendering. Proceedings of the Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00982"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Bolya, D., Zhou, C., Xiao, F., and Lee, Y.J. (2019, January 15\u201320). Yolact: Real-time instance segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Long Beach, CA, USA.","DOI":"10.1109\/ICCV.2019.00925"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chen, H., Sun, K., Tian, Z., Shen, C., Huang, Y., and Yan, Y. (2020, January 13\u201319). Blendmask: Top-down meets bottom-up for instance segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00860"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Wang, X., Kong, T., Shen, C., Jiang, Y., and Li, L. (2020, January 23\u201328). SOLO: Segmenting Objects by Locations. 
Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58523-5_38"},{"key":"ref_20","first-page":"17721","article-title":"SOLOv2: Dynamic and Fast Instance Segmentation","volume":"33","author":"Wang","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_21","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2021). An Image is Worth 16 \u00d7 16 Words: Transformers for Image Recognition at Scale. arXiv."},{"key":"ref_22","unstructured":"Kirillov, A., Usunier, N., Carion, N., Zagoruyko, S., Synnaeve, G., and Massa, F. (2020, January 23\u201328). End-to-End Object Detection with Transformers. Proceedings of the European Conference on Computer Vision, Glasgow, UK."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Fang, Y., Yang, S., Wang, X., Li, Y., Fang, C., Shan, Y., Feng, B., and Liu, W. (2021, January 10\u201317). Instances as Queries. Proceedings of the International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00683"},{"key":"ref_24","first-page":"17864","article-title":"Per-pixel classification is not all you need for semantic segmentation","volume":"34","author":"Cheng","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Cheng, B., Misra, I., Schwing, A.G., Kirillov, A., and Girdhar, R. (2022, January 21\u201324). Masked-attention mask transformer for universal image segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00135"},{"key":"ref_26","first-page":"21898","article-title":"Solq: Segmenting objects by learning queries","volume":"34","author":"Dong","year":"2021","journal-title":"Adv. Neural Inf. Process. 
Syst."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Fang, F., Wu, K., Liu, Y., Li, S., Wan, B., Chen, Y., and Zheng, D. (2021). A Coarse-to-Fine Contour Optimization Network for Extracting Building Instances from High-Resolution Remote Sensing Imagery. Remote Sens., 13.","DOI":"10.3390\/rs13193814"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Li, Y., Xu, W., Chen, H., Jiang, J., and Li, X. (2021). A Novel Framework Based on Mask R-CNN and Histogram Thresholding for Scalable Segmentation of New and Old Rural Buildings. Remote Sens., 13.","DOI":"10.3390\/rs13061070"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Wu, T., Hu, Y., Peng, L., and Chen, R. (2020). Improved Anchor-Free Instance Segmentation for Building Extraction from High-Resolution Remote Sensing Images. Remote Sens., 12.","DOI":"10.3390\/rs12182910"},{"key":"ref_30","first-page":"1","article-title":"Building Instance Extraction Method Based on Improved Hybrid Task Cascade","volume":"19","author":"Liu","year":"2022","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1109\/TGRS.2018.2858817","article-title":"Fully Convolutional Networks for Multisource Building Extraction From an Open Aerial and Satellite Imagery Data Set","volume":"57","author":"Ji","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"2611","DOI":"10.1109\/JSTARS.2021.3058097","article-title":"Attention-Gate-Based Encoder\u2013Decoder Network for Automatical Building Extraction","volume":"14","author":"Deng","year":"2021","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhou, J., Liu, Y., Nie, G., Cheng, H., Yang, X., Chen, X., and Gross, L. (2022). 
Building Extraction and Floor Area Estimation at the Village Level in Rural China Via a Comprehensive Method Integrating UAV Photogrammetry and the Novel EDSANet. Remote Sens., 14.","DOI":"10.3390\/rs14205175"},{"key":"ref_34","first-page":"1","article-title":"CSA-UNet: Channel-Spatial Attention-Based Encoder\u2013Decoder Network for Rural Blue-Roofed Building Extraction From UAV Imagery","volume":"19","author":"Shi","year":"2022","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Wei, R., Fan, B., Wang, Y., Zhou, A., and Zhao, Z. (2022). MBNet: Multi-Branch Network for Extraction of Rural Homesteads Based on Aerial Images. Remote Sens., 14.","DOI":"10.3390\/rs14102443"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_37","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020). Deformable DETR: Deformable Transformers for End-to-End Object Detection. arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Milletari, F., Navab, N., and Ahmadi, S.-A. (2016, January 25\u201328). V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. Proceedings of the International Conference on 3d Vision, Stanford, CA, USA.","DOI":"10.1109\/3DV.2016.79"},{"key":"ref_39","unstructured":"Xiong, R., Yang, Y., He, D., Zheng, K., Zheng, S., Xing, C., Zhang, H., Lan, Y., Wang, L., and Liu, T. (2020, January 13\u201318). On Layer Normalization in the Transformer Architecture. 
Proceedings of the International Conference on Machine Learning, Virtual."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs","volume":"40","author":"Chen","year":"2016","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_41","unstructured":"Loshchilov, I., and Hutter, F. (2018). Decoupled Weight Decay Regularization. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/7\/3643\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,15]],"date-time":"2025-01-15T11:44:25Z","timestamp":1736941465000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/7\/3643"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,31]]},"references-count":42,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2023,4]]}},"alternative-id":["s23073643"],"URL":"https:\/\/doi.org\/10.3390\/s23073643","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2023,3,31]]}}}