Clothing Image Sets Classification Based on Manifold Structure Neural Network

Computer Science ›› 2021, Vol. 48 ›› Issue (11A): 391-395. doi: 10.11896/jsjkx.201200127

• Image Processing & Multimedia Technology •

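CHENG Ming (程铭), MA Pei (马佩), HE Ru-han (何儒汉)

  1. School of Mathematics and Computer Science, Wuhan Textile University, Wuhan 430200, China
  • Online: 2021-11-10 Published: 2021-11-12
  • Corresponding author: HE Ru-han (heruhan@wtu.edu.cn)
  • About author: cm_jsw@163.com
  • Supported by:
    General Program of the National Natural Science Foundation of China (61170093)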

Clothing Image Sets Classification Based on Manifold Structure Neural Network

CHENG Ming, MA Pei, HE Ru-han   

  1. School of Mathematics and Computer Science, Wuhan Textile University, Wuhan 430200, China
  • Online: 2021-11-10 Published: 2021-11-12
  • About author: CHENG Ming, born in 1996, postgraduate, is a student member of China Computer Federation. His main research interests include machine learning and computer vision.
    HE Ru-han, born in 1974, Ph.D., professor, master's supervisor, is a member of China Computer Federation. His main research interests include machine learning, computer vision and multimedia retrieval.
  • Supported by:
    National Natural Science Foundation of China (61170093).

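Abstract (translated from the Chinese): With the release of large-scale fashion datasets, deep-learning-based clothing image classification has developed rapidly. However, most current methods classify a garment from a single frontal or near-frontal image, so garments are often misclassified when the viewpoint changes; the large deformation and severe occlusion typical of real-world clothing make the problem worse. To address this, a clothing image set classification method based on a manifold structure neural network is proposed, which uses manifold space to better represent the internal structural features of clothing. The method takes multi-view clothing image sets as its experimental data. It first extracts shallow features of each clothing image set with a convolutional neural network, then converts the Euclidean data into manifold data through covariance pooling, and finally learns the internal structural features of the image set with a manifold-structure-based neural network to obtain accurate classification results. Experiments show that the proposed method reaches Precision, Recall and F-1 scores of 89.64%, 89.12% and 88.69% on the MVC dataset, improvements of 2.04%, 2.65% and 2.70% respectively over existing image set (video) classification algorithms, and that it is more accurate, efficient and robust than existing algorithms.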

Key words: Clothing image set classification, Computer vision, Manifold neural network, Deep learning, Fashion analysis

Abstract: Clothing classification based on deep learning has developed rapidly with the release of large-scale fashion datasets. However, most current clothing image classification methods assume that each garment is represented by a single frontal or near-frontal image, which leads to misclassification when the view of the garment changes. In practice, the large deformation and severe occlusion typical of clothing further aggravate the problem. Therefore, a clothing image set classification method based on a manifold structure neural network is proposed, which uses manifold space to better represent the internal structural characteristics of clothing. Concretely, the shallow features of the clothing image set are first extracted by a conventional convolutional neural network, and the Euclidean feature data are then converted into manifold data by covariance pooling. Finally, the internal manifold structure of each clothing image set is learned by a neural network based on manifold structure to obtain more accurate classification results. Experimental results show that the Precision, Recall and F-1 score of the proposed method on the MVC dataset reach 89.64%, 89.12% and 88.69%, respectively. Compared with existing image set (video) classification algorithms, the proposed method obtains improvements of 2.04%, 2.65% and 2.70%, indicating that it is more accurate, efficient and robust than existing methods.

Key words: Clothing image set classification, Computer vision, Deep learning, Fashion analysis, Manifold neural network
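
The three stages named in the abstract (CNN features, covariance pooling, manifold network) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: random vectors stand in for CNN features, the manifold layers follow the common bilinear-mapping, eigenvalue-rectification and log-eigenvalue pattern used for symmetric positive definite (SPD) matrices, and every function name, layer size and the `eps` regularizer is hypothetical. The abstract does not say which manifold layers or dimensions are actually used.

```python
import numpy as np


def covariance_pooling(features, eps=1e-4):
    """Map one image set's features (n_images x d) to a single d x d SPD matrix."""
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(features.shape[0] - 1, 1)
    # Assumed ridge term: keeps the matrix strictly positive definite
    # when the set has fewer images than feature dimensions.
    return cov + eps * np.eye(features.shape[1])


def bimap(spd, weight):
    """Bilinear mapping W^T X W with a column-orthonormal W (d_in x d_out)."""
    return weight.T @ spd @ weight


def reeig(spd, threshold=1e-4):
    """Eigenvalue rectification: clamp eigenvalues from below."""
    vals, vecs = np.linalg.eigh(spd)
    return (vecs * np.maximum(vals, threshold)) @ vecs.T


def logeig(spd):
    """Matrix logarithm: map the SPD matrix to a flat (Euclidean) space."""
    vals, vecs = np.linalg.eigh(spd)
    return (vecs * np.log(vals)) @ vecs.T


def random_orthonormal(d_in, d_out, rng):
    """Hypothetical initializer: d_in x d_out matrix with orthonormal columns."""
    q, _ = np.linalg.qr(rng.standard_normal((d_in, d_out)))
    return q


def forward(image_set_features, weights, classifier):
    """Covariance pooling, stacked BiMap/ReEig layers, LogEig, linear classifier."""
    spd = covariance_pooling(image_set_features)
    for w in weights:
        spd = reeig(bimap(spd, w))
    logits = logeig(spd).reshape(-1) @ classifier
    return spd, logits


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 64))  # stand-in for 8 views, 64-d CNN features
weights = [random_orthonormal(64, 32, rng), random_orthonormal(32, 16, rng)]
classifier = rng.standard_normal((16 * 16, 5))  # 5 hypothetical clothing classes
spd, logits = forward(features, weights, classifier)
print(spd.shape, logits.shape)
```

The point of the sketch is that the whole image set, regardless of how many views it contains, is summarized by one SPD matrix, and the later layers keep the matrix on the SPD manifold until the final logarithm flattens it for an ordinary classifier. Training (for example, keeping the weights orthonormal during optimization) is omitted.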

CLC Number:

  • TP391
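
The Precision, Recall and F-1 figures quoted in the abstract can be related to per-class counts with a short sketch. The abstract does not state how the scores are averaged over classes; macro-averaging is assumed here, and the function name and toy labels are hypothetical. One observation supports the assumption without proving it: the harmonic mean of the reported 89.64% and 89.12% is about 89.38%, which is higher than the reported F-1 of 88.69%, so the F-1 score was presumably averaged per class (or per sample) rather than computed from the aggregate Precision and Recall.

```python
import numpy as np


def macro_precision_recall_f1(y_true, y_pred, num_classes):
    """Macro-averaged precision, recall and F1 (assumed averaging scheme)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return float(np.mean(precisions)), float(np.mean(recalls)), float(np.mean(f1s))


# Toy labels for three hypothetical clothing classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
precision, recall, f1 = macro_precision_recall_f1(y_true, y_pred, num_classes=3)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
# P=0.7222 R=0.6667 F1=0.6556
```

In this toy case the macro F1 falls below both the macro precision and the macro recall, the same ordering seen in the reported results.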