{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,20]],"date-time":"2024-07-20T07:11:35Z","timestamp":1721459495335},"reference-count":67,"publisher":"Association for Computing Machinery (ACM)","issue":"1","funder":[{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"crossref","award":["61836011, 61822208, and 61632019"],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100004739","name":"Youth Innovation Promotion Association CAS","doi-asserted-by":"crossref","award":["2018497"],"id":[{"id":"10.13039\/501100004739","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2021,2,28]]},"abstract":"Affinity, which represents whether two pixels belong to a same instance, is an equivalent representation to the instance segmentation labels. Conventional works do not make an explicit exploration on the affinity. In this article, we present two instance segmentation schemes based on pixel affinity information and show the effectiveness of affinity in both aspects. For proposal-free method, we predict pixel affinity for each image and then propose a simple yet effective graph merge algorithm to cluster pixels into instances. It shows that the affinity is powerful as an instance-relevant information to guide the clustering procedure in proposal-free instance segmentation. For proposal-based methods, we extend conventional framework with affinity head and introduce affinity as attached supervision in training phase. Without any additional inference cost, we can improve the performance of existing proposal-based instance segmentation methods, which shows that the affinity can also be applied as an auxiliary loss and training with such extra loss is beneficial to the training progress. 
Experimental results show that our schemes achieve comparable performance to other state-of-the-art instance segmentation methods. With Cityscapes training data, the proposed proposal-free method achieves 28.8 AP and the proposal-based method gets 27.2 AP both on test sets.","DOI":"10.1145\/3407090","type":"journal-article","created":{"date-parts":[[2021,4,16]],"date-time":"2021-04-16T12:42:08Z","timestamp":1618576928000},"page":"1-20","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Affinity Derivation for Accurate Instance Segmentation"],"prefix":"10.1145","volume":"17","author":[{"given":"Yiding","family":"Liu","sequence":"first","affiliation":[{"name":"University of Science and Technology of China, China"}]},{"given":"Siyu","family":"Yang","sequence":"additional","affiliation":[{"name":"Airbnb Information Technology (Beijing) Co., Ltd., China"}]},{"given":"Bin","family":"Li","sequence":"additional","affiliation":[{"name":"Microsoft Research, China"}]},{"given":"Wengang","family":"Zhou","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China and Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, China"}]},{"given":"Jizheng","family":"Xu","sequence":"additional","affiliation":[{"name":"Microsoft Research, China"}]},{"given":"Houqiang","family":"Li","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China and Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, China"}]},{"given":"Yan","family":"Lu","sequence":"additional","affiliation":[{"name":"Microsoft Research, 
China"}]}],"member":"320","published-online":{"date-parts":[[2021,4,16]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00523"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.100"},{"key":"e_1_2_1_3_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917)","author":"Bai M.","unstructured":"M. Bai and R. Urtasun. 2017. Deep watershed transform for instance segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917). 2858\u20132866."},{"key":"e_1_2_1_4_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW\u201917)","author":"Brabandere B. D.","unstructured":"B. D. Brabandere, D. Neven, and L. V. Gool. 2017. Semantic instance segmentation for autonomous driving. In IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW\u201917). 478\u2013480."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2699184"},{"key":"e_1_2_1_6_1","volume-title":"Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587","author":"Chen Liang-Chieh","year":"2017","unstructured":"Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam. 2017. Rethinking atrous convolution for semantic image segmentation. 
arXiv preprint arXiv:1706.05587 (2017)."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00215"},{"key":"e_1_2_1_9_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916)","author":"Cordts M.","unstructured":"M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele. 2016. The cityscapes dataset for semantic urban scene understanding. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916). 3213\u20133223."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46466-4_32"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299025"},{"key":"e_1_2_1_12_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916)","author":"Dai J.","unstructured":"J. Dai, K. He, and J. Sun. 2016. Instance-aware semantic segmentation via multi-task network cascades. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916). 3150\u20133158."},{"key":"e_1_2_1_13_1","unstructured":"Jifeng Dai, Yi Li, Kaiming He, and Jian Sun. 2016. 
R-FCN: Object detection via region-based fully convolutional networks. In Advances in Neural Information Processing Systems (NeurIPS\u201916). 379\u2013387."},{"key":"e_1_2_1_14_1","volume-title":"IEEE International Conference on Computer Vision (ICCV\u201917)","author":"Dai J.","unstructured":"J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei. 2017. Deformable convolutional networks. In IEEE International Conference on Computer Vision (ICCV\u201917). 764\u2013773."},{"key":"e_1_2_1_15_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201914)","author":"Erhan D.","unstructured":"D. Erhan, C. Szegedy, A. Toshev, and D. Anguelov. 2014. Scalable object detection using deep neural networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201914). 2155\u20132162."},{"key":"e_1_2_1_16_1","volume-title":"Sergio Guadarrama, and Kevin P. Murphy.","author":"Fathi Alireza","year":"2017","unstructured":"Alireza Fathi, Zbigniew Wojna, Vivek Rathod, Peng Wang, Hyun Oh Song, Sergio Guadarrama, and Kevin P. Murphy. 2017. Semantic instance segmentation via deep metric learning. arXiv preprint arXiv:1703.10277 (2017)."},{"key":"e_1_2_1_17_1","volume-title":"Stacked deconvolutional network for semantic segmentation. 
arXiv preprint arXiv:1708.04943","author":"Fu Jun","year":"2017","unstructured":"Jun Fu, Jing Liu, Yuhang Wang, and Hanqing Lu. 2017. Stacked deconvolutional network for semantic segmentation. arXiv preprint arXiv:1708.04943 (2017)."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00073"},{"key":"e_1_2_1_19_1","volume-title":"A review on deep learning techniques applied to semantic segmentation. arXiv preprint arXiv:1704.06857","author":"Garcia-Garcia Alberto","year":"2017","unstructured":"Alberto Garcia-Garcia, Sergio Orts-Escolano, Sergiu Oprea, Victor Villena-Martinez, and Jose Garcia-Rodriguez. 2017. A review on deep learning techniques applied to semantic segmentation. arXiv preprint arXiv:1704.06857 (2017)."},{"key":"e_1_2_1_20_1","volume-title":"Fast R-CNN. In IEEE International Conference on Computer Vision (ICCV\u201915)","author":"Girshick R.","year":"2015","unstructured":"R. Girshick. 2015. Fast R-CNN. In IEEE International Conference on Computer Vision (ICCV\u201915). 1440\u20131448."},{"key":"e_1_2_1_21_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201914)","author":"Girshick R.","unstructured":"R. Girshick, J. Donahue, T. Darrell, and J. Malik. 2014. 
Rich feature hierarchies for accurate object detection and semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201914). 580\u2013587."},{"key":"e_1_2_1_22_1","unstructured":"Ross Girshick, Ilija Radosavovic, Georgia Gkioxari, Piotr Doll\u00e1r, and Kaiming He. 2018. Detectron. Retrieved from https:\/\/github.com\/facebookresearch\/detectron."},{"key":"e_1_2_1_23_1","volume-title":"IEEE International Conference on Computer Vision (ICCV)","volume":"2","author":"Grauman K.","unstructured":"K. Grauman and T. Darrell. 2005. The pyramid match kernel: Discriminative classification with sets of image features. In IEEE International Conference on Computer Vision (ICCV), Vol. 2. 1458\u20131465."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2011.6126343"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10584-0_20"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298642"},{"key":"e_1_2_1_27_1","volume-title":"Shape-aware instance segmentation. arXiv preprint arXiv:1612.03129","author":"Hayder Zeeshan","year":"2016","unstructured":"Zeeshan Hayder, Xuming He, and Mathieu Salzmann. 2016. Shape-aware instance segmentation. arXiv preprint arXiv:1612.03129 (2016)."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.70"},{"key":"e_1_2_1_29_1","volume-title":"Mask R-CNN. 
In IEEE International Conference on Computer Vision (ICCV\u201917)","author":"He K.","unstructured":"K. He, G. Gkioxari, P. Doll\u00e1r, and R. Girshick. 2017. Mask R-CNN. In IEEE International Conference on Computer Vision (ICCV\u201917). 2980\u20132988."},{"key":"e_1_2_1_30_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916)","author":"He K.","unstructured":"K. He, X. Zhang, S. Ren, and J. Sun. 2016. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916). 770\u2013778."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2018.8489379"},{"key":"e_1_2_1_32_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917)","author":"Huang J.","unstructured":"J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, and K. Murphy. 2017. Speed\/accuracy trade-offs for modern convolutional object detectors. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917). 3296\u20133297."},{"key":"e_1_2_1_33_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917)","author":"Islam M. A.","unstructured":"M. A. Islam, M. Rochan, N. D. B. Bruce, and Y. Wang. 2017. 
Gated feedback refinement network for dense image labeling. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917). 4877\u20134885."},{"key":"e_1_2_1_34_1","volume-title":"Object detection free instance segmentation with labeling transformations. arXiv preprint arXiv:1611.08991","author":"Jin Long","year":"2016","unstructured":"Long Jin, Zeyu Chen, and Zhuowen Tu. 2016. Object detection free instance segmentation with labeling transformations. arXiv preprint arXiv:1611.08991 (2016)."},{"key":"e_1_2_1_35_1","volume-title":"European Conference on Computer Vision (ECCV\u201918)","author":"Ke Tsung-Wei","unstructured":"Tsung-Wei Ke, Jyh-Jing Hwang, Ziwei Liu, and Stella X. Yu. 2018. Adaptive affinity fields for semantic segmentation. In European Conference on Computer Vision (ECCV\u201918). 605\u2013621."},{"key":"e_1_2_1_36_1","volume-title":"Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. arXiv preprint arXiv:1705.07115 3","author":"Kendall Alex","year":"2017","unstructured":"Alex Kendall, Yarin Gal, and Roberto Cipolla. 2017. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. 
arXiv preprint arXiv:1705.07115 3 (2017)."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.204"},{"key":"e_1_2_1_38_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917)","author":"Kirillov A.","unstructured":"A. Kirillov, E. Levinkov, B. Andres, B. Savchynskyy, and C. Rother. 2017. InstanceCut: From edges to instances with multicut. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917). 7322\u20137331."},{"key":"e_1_2_1_39_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201906)","volume":"2","author":"Lazebnik S.","unstructured":"S. Lazebnik, C. Schmid, and J. Ponce. 2006. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201906), Vol. 2. 2169\u20132178."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1989.1.4.541"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.206"},{"key":"e_1_2_1_42_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917)","author":"Li Y.","unstructured":"Y. Li, H. Qi, J. Dai, X. Ji, and Y. Wei. 2017. 
Fully convolutional instance-aware semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917). 4438\u20134446."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01240-3_21"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2775623"},{"key":"e_1_2_1_45_1","volume-title":"Proposal-free network for instance-level object segmentation. arXiv preprint arXiv:1509.02636","author":"Liang Xiaodan","year":"2015","unstructured":"Xiaodan Liang, Yunchao Wei, Xiaohui Shen, Jianchao Yang, Liang Lin, and Shuicheng Yan. 2015. Proposal-free network for instance-level object segmentation. arXiv preprint arXiv:1509.02636 (2015)."},{"key":"e_1_2_1_46_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916)","author":"Lin G.","unstructured":"G. Lin, C. Shen, A. v. d. Hengel, and I. Reid. 2016. Efficient piecewise training of deep structured models for semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916). 3194\u20133203."},{"key":"e_1_2_1_47_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917)","author":"Lin T. Y.","unstructured":"T. Y. Lin, P. Doll\u00e1r, R. Girshick, K. He, B. Hariharan, and S. Belongie. 2017. 
Feature pyramid networks for object detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917). 936\u2013944."},{"key":"e_1_2_1_48_1","volume-title":"IEEE International Conference on Computer Vision (ICCV\u201917)","author":"Liu S.","unstructured":"S. Liu, J. Jia, S. Fidler, and R. Urtasun. 2017. SGN: Sequential grouping networks for instance segmentation. In IEEE International Conference on Computer Vision (ICCV\u201917). 3516\u20133524."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00913"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.342"},{"key":"e_1_2_1_51_1","volume-title":"European Conference on Computer Vision (ECCV\u201916)","author":"Liu Wei","unstructured":"Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. 2016. SSD: Single shot multibox detector. In European Conference on Computer Vision (ECCV\u201916). 21\u201337."},{"key":"e_1_2_1_52_1","volume-title":"Berg","author":"Liu Wei","year":"2015","unstructured":"Wei Liu, Andrew Rabinovich, and Alexander C. Berg. 2015. ParseNet: Looking wider to see better. arXiv preprint arXiv:1506.04579 (2015)."},{"key":"e_1_2_1_53_1","volume-title":"IEEE International Conference on Computer Vision (ICCV\u201915)","author":"Liu Z.","unstructured":"Z. Liu, X. Li, P. Luo, C. C. Loy, and X. Tang. 2015. 
Semantic image segmentation via deep parsing network. In IEEE International Conference on Computer Vision (ICCV\u201915). 1377\u20131385."},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00904"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46484-8_29"},{"key":"e_1_2_1_56_1","unstructured":"Pedro O. Pinheiro, Ronan Collobert, and Piotr Doll\u00e1r. 2015. Learning to segment object candidates. In Advances in Neural Information Processing Systems (NIPS\u201915). 1990\u20131998."},{"key":"e_1_2_1_57_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916)","author":"Redmon J.","unstructured":"J. Redmon, S. Divvala, R. Girshick, and A. Farhadi. 2016. You only look once: Unified, real-time object detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201916). 
779\u2013788."},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2577031"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2572683"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00066"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-45886-1_2"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00272"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00902"},{"key":"e_1_2_1_65_1","volume-title":"Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122","author":"Yu Fisher","year":"2015","unstructured":"Fisher Yu and Vladlen Koltun. 2015. Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122 (2015)."},{"key":"e_1_2_1_66_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917)","author":"Zhao H.","unstructured":"H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. 2017. Pyramid scene parsing network. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917). 
6230\u20136239."},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2018.8545708"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3407090","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T12:58:44Z","timestamp":1672577924000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3407090"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,28]]},"references-count":67,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,2,28]]}},"alternative-id":["10.1145\/3407090"],"URL":"https:\/\/doi.org\/10.1145\/3407090","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,28]]},"assertion":[{"value":"2019-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-04-16","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}