Abstract
In this paper, we address the problem of large-scale cross-scenario clothing retrieval with semantic-preserving visual phrases (SPVP). Since human parts are important cues for clothing detection and segmentation, we first detect human parts as the semantic context and refine the regions of the human parts with sparse background reconstruction. The semantic parts are then encoded into the vocabulary tree under the bag-of-visual-words (BOW) framework, and the contextual constraints among visual words in different human parts are exploited through the SPVP. Moreover, the SPVP is integrated into the inverted index structure to accelerate retrieval. Experiments and comparisons on our clothing dataset indicate that the SPVP significantly enhances the discriminative power of local features, with only a slight increase in memory usage and runtime compared to the BOW model. As a result, the approach outperforms both the state-of-the-art approach and two clothing search engines.
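The idea of attaching semantic parts to visual words inside an inverted index can be illustrated with a minimal sketch. It is not the paper's implementation: the class name, the part labels, the word ids, and the histogram-intersection scoring are all assumptions made for illustration, and the part detection, vocabulary tree, and phrase construction steps are omitted entirely.

```python
from collections import Counter, defaultdict


class PartAwareInvertedIndex:
    """Toy inverted index keyed by (human part, visual word) pairs.

    Hypothetical illustration only; the keying and scoring are assumptions.
    """

    def __init__(self):
        # (part_id, word_id) -> Counter mapping image_id to occurrence count
        self.postings = defaultdict(Counter)

    def add(self, image_id, features):
        # features: iterable of (part_id, word_id) pairs for one image
        for key in features:
            self.postings[key][image_id] += 1

    def query(self, features):
        # Score images by histogram intersection over part-tagged words,
        # so a visual word only matches when it occurs in the same part.
        scores = Counter()
        for key, q_count in Counter(features).items():
            for image_id, d_count in self.postings.get(key, {}).items():
                scores[image_id] += min(q_count, d_count)
        return scores.most_common()


index = PartAwareInvertedIndex()
index.add("shirt_a", [("torso", 17), ("torso", 42), ("left_arm", 5)])
index.add("shirt_b", [("left_arm", 17), ("torso", 8)])

# Word 17 appears in both images, but only shirt_a has it on the torso.
ranked = index.query([("torso", 17), ("left_arm", 5)])
```

Because only the posting lists of the query's own (part, word) keys are visited, lookup cost stays close to that of a plain BOW inverted index, while a word found on a different part no longer contributes to the score.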
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fu, J., Wang, J., Li, Z., Xu, M., Lu, H. (2013). Efficient Clothing Retrieval with Semantic-Preserving Visual Phrases. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37444-9_33
DOI: https://doi.org/10.1007/978-3-642-37444-9_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37443-2
Online ISBN: 978-3-642-37444-9
eBook Packages: Computer Science, Computer Science (R0)