{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,1,1]],"date-time":"2024-01-01T16:10:32Z","timestamp":1704125432120},"reference-count":64,"publisher":"Institution of Engineering and Technology (IET)","issue":"8","license":[{"start":{"date-parts":[[2023,5,15]],"date-time":"2023-05-15T00:00:00Z","timestamp":1684108800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"content-domain":{"domain":["ietresearch.onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["IET Computer Vision"],"published-print":{"date-parts":[[2023,12]]},"abstract":"Weakly supervised object detection (WSOD) is becoming increasingly important for computer vision tasks, as it alleviates the burden of manual annotation. Most WSOD techniques rely on multiple instance learning (MIL), which tends to localise the discriminative parts of salient objects instead of the whole object. In addition, network training is often supervised using simple image\u2010level annotations, without including object quantities or location information. This can lead to ambiguous differentiation of object instances, both in terms of location and semantics. To address these issues, an end\u2010to\u2010end triple critical feature capture network (TCFCNet) for WSOD is proposed. Specifically, a multi\u2010task branch, which can perform fully supervised classification and regression tasks, is integrated with a PCL in an end\u2010to\u2010end network to refine object locations online. A cyclic parametric dropblock module (CPDM) is then designed to help the detector focus on contextual information by using cyclic masking to maximise the removal of the discriminative components of an object instance, thereby alleviating the part domination problem. 
Finally, a feature decoupling module (FDM), which contains a feature enhancement module and task\u2010specific polarisation functions, is proposed to further reduce the ambiguous distinction of object instances by adaptively constructing robust critical features suited to the classification and regression tasks of the multi\u2010task branch. Comprehensive experiments are carried out on the challenging Pascal VOC 2007 and VOC 2012 datasets. The proposed method achieves 54.6% mAP and 44.3% mAP on the Pascal VOC 2007 and VOC 2012 datasets respectively, showing that it outperforms existing mainstream techniques by a considerable margin.","DOI":"10.1049\/cvi2.12203","type":"journal-article","created":{"date-parts":[[2023,5,15]],"date-time":"2023-05-15T08:01:28Z","timestamp":1684137688000},"page":"895-912","update-policy":"http:\/\/dx.doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Triple critical feature capture network: A triple critical feature capture network for weakly supervised object detection"],"prefix":"10.1049","volume":"17","author":[{"ORCID":"http:\/\/orcid.org\/0000-0002-2302-4879","authenticated-orcid":false,"given":"Zhoufeng","family":"Liu","sequence":"first","affiliation":[{"name":"School of Electronic and Information Engineering Zhongyuan University of Technology Zhengzhou China"}]},{"given":"Kaihua","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Electronic and Information Engineering Zhongyuan University of Technology Zhengzhou China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-6543-1838","authenticated-orcid":false,"given":"Chunlei","family":"Li","sequence":"additional","affiliation":[{"name":"School of Electronic and Information Engineering Zhongyuan University of Technology Zhengzhou China"}]},{"given":"Shunmin","family":"Ding","sequence":"additional","affiliation":[{"name":"Department of Energy and Environment Zhongyuan University of Technology Zhengzhou 
China"}]},{"given":"Jiangtao","family":"Xi","sequence":"additional","affiliation":[{"name":"School of Electrical, Computer and Telecommunications Engineering University of Wollongong Wollongong New South Wales Australia"}]}],"member":"265","published-online":{"date-parts":[[2023,5,15]]},"reference":[{"key":"e_1_2_10_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"e_1_2_10_3_1","unstructured":"Thuan D.:Evolution of Yolo Algorithm and Yolov5: The State\u2010Of\u2010The\u2010Art Object Detention Algorithm(2021)"},{"key":"e_1_2_10_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/tcsvt.2020.2987465"},{"key":"e_1_2_10_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/tcsvt.2019.2900709"},{"key":"e_1_2_10_6_1","unstructured":"Ge Z. et\u00a0al.:Yolox: Exceeding Yolo Series in 2021(2021).arXiv preprint arXiv:2107.08430"},{"key":"e_1_2_10_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2022.3231744"},{"key":"e_1_2_10_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2022.3178180"},{"key":"e_1_2_10_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263\u2010015\u20100816\u2010y"},{"key":"e_1_2_10_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263\u2010019\u201001247\u20104"},{"key":"e_1_2_10_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.545"},{"key":"e_1_2_10_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.311"},{"key":"e_1_2_10_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.326"},{"key":"e_1_2_10_14_1","first-page":"1377","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Jie Z.","year":"2017"},{"key":"e_1_2_10_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.382"},{"key":"e_1_2_10_16_1","doi-asserted-by":"crossref","unstructured":"Chen M. 
et\u00a0al.:Online progressive instance\u2010balanced sampling for weakly supervised object detection(2022).arXiv preprint arXiv:2206.10324","DOI":"10.1109\/TIM.2023.3273655"},{"key":"e_1_2_10_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/tip.2020.2987161"},{"key":"e_1_2_10_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.169"},{"key":"e_1_2_10_19_1","article-title":"Faster R\u2010CNN: towards real\u2010time object detection with region proposal networks","volume":"28","author":"Ren S.","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"e_1_2_10_20_1","first-page":"1277","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Ge W.","year":"2018"},{"key":"e_1_2_10_21_1","first-page":"240","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Zhang X.","year":"2018"},{"key":"e_1_2_10_22_1","first-page":"8372","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Yang K.","year":"2019"},{"key":"e_1_2_10_23_1","first-page":"10598","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Ren Z.","year":"2020"},{"key":"e_1_2_10_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/tpami.2018.2876304"},{"key":"e_1_2_10_25_1","first-page":"1297","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Wan F.","year":"2018"},{"key":"e_1_2_10_26_1","doi-asserted-by":"crossref","unstructured":"Wang J. 
et\u00a0al.:Collaborative learning for weakly supervised object detection(2018).arXiv preprint arXiv:1802.03531","DOI":"10.24963\/ijcai.2018\/135"},{"key":"e_1_2_10_27_1","first-page":"8292","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Zeng Z.","year":"2019"},{"key":"e_1_2_10_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00619"},{"key":"e_1_2_10_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01158"},{"key":"e_1_2_10_30_1","unstructured":"Jang E. Gu S. Poole B.:Categorical re parameterization with Gumbel\u2010Softmax(2016).arXiv preprint arXiv:1611.01144"},{"key":"e_1_2_10_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3074313"},{"key":"e_1_2_10_32_1","article-title":"Weakly\u2010supervised discovery of visual pattern configurations","volume":"27","author":"Song H.O.","year":"2014","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"e_1_2_10_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298906"},{"key":"e_1_2_10_34_1","first-page":"4315","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Rochan M.","year":"2015"},{"key":"e_1_2_10_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-19821-2_18"},{"key":"e_1_2_10_36_1","first-page":"16797","article-title":"Comprehensive attention self\u2010distillation for weakly\u2010supervised object detection","volume":"33","author":"Huang Z.","year":"2020","journal-title":"Adv. Neural Inf. Process. 
Syst."},{"key":"e_1_2_10_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00604"},{"key":"e_1_2_10_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00448"},{"key":"e_1_2_10_39_1","first-page":"928","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Zhang Y.","year":"2018"},{"key":"e_1_2_10_40_1","first-page":"434","volume-title":"In Proceedings of the European Conference on Computer Vision (ECCV)","author":"Wei Y.","year":"2018"},{"key":"e_1_2_10_41_1","first-page":"350","volume-title":"European Conference on Computer Vision","author":"Kantorov V.","year":"2016"},{"key":"e_1_2_10_42_1","volume-title":"IEEE Transactions on Multimedia","author":"Wu Z.","year":"2021"},{"key":"e_1_2_10_43_1","first-page":"3534","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Kim D.","year":"2017"},{"key":"e_1_2_10_44_1","first-page":"2199","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wan F.","year":"2019"},{"key":"e_1_2_10_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW50498.2020.00392"},{"key":"e_1_2_10_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2021.3056887"},{"key":"e_1_2_10_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01393"},{"key":"e_1_2_10_48_1","doi-asserted-by":"crossref","unstructured":"Shao F. et\u00a0al.:Deep Learning for Weakly\u2010Supervised Object Detection and Localization. 
A survey Neurocomputing(2022)","DOI":"10.1016\/j.neucom.2022.01.095"},{"key":"e_1_2_10_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00144"},{"key":"e_1_2_10_50_1","first-page":"9834","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Gao Y.","year":"2019"},{"key":"e_1_2_10_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.106"},{"key":"e_1_2_10_52_1","article-title":"Object detection via region\u2010based fully convolutional networks","volume":"29","author":"Dai J.","year":"2016","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"e_1_2_10_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01020"},{"key":"e_1_2_10_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.527"},{"key":"e_1_2_10_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/tpami.2015.2389824"},{"key":"e_1_2_10_56_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263\u2010013\u20100620\u20105"},{"key":"e_1_2_10_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/tcsvt.2022.3168547"},{"key":"e_1_2_10_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/tgrs.2021.3095186"},{"key":"e_1_2_10_59_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263\u2010009\u20100275\u20104"},{"issue":"1","key":"e_1_2_10_60_1","first-page":"1929","article-title":"Dropout: a simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava N.","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_2_10_61_1","article-title":"Dropblock: a regularization method for convolutional networks","volume":"31","author":"Ghiasi G.","year":"2018","journal-title":"Adv. Neural Inf. Process. 
Syst."},{"key":"e_1_2_10_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01301"},{"key":"e_1_2_10_63_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2020.08.028"},{"key":"e_1_2_10_64_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00079"},{"key":"e_1_2_10_65_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_26"}],"container-title":["IET Computer Vision"],"original-title":[],"language":"en","deposited":{"date-parts":[[2024,1,1]],"date-time":"2024-01-01T15:36:30Z","timestamp":1704123390000},"score":1,"resource":{"primary":{"URL":"https:\/\/ietresearch.onlinelibrary.wiley.com\/doi\/10.1049\/cvi2.12203"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,15]]},"references-count":64,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2023,12]]}},"alternative-id":["10.1049\/cvi2.12203"],"URL":"https:\/\/doi.org\/10.1049\/cvi2.12203","archive":["Portico"],"relation":{},"ISSN":["1751-9632","1751-9640"],"issn-type":[{"value":"1751-9632","type":"print"},{"value":"1751-9640","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,15]]},"assertion":[{"value":"2023-01-09","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-04-19","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-05-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}