Markov random field based fusion for supervised and semi-supervised multi-modal image classification

Xie, Liang; Pan, Peng; Lu, Yansheng

doi:10.1007/s11042-014-2018-y

Published: 28 May 2014

Volume 74, pages 613–634, (2015)

Multimedia Tools and Applications

Abstract

In recent years, multimedia content on the web has grown explosively, and multi-modal examples such as images with associated tags are easily obtained from social websites such as Flickr. To take advantage of this growing supply of multi-modal examples, we consider two tasks: supervised and semi-supervised multi-modal image classification. For the supervised task, we first propose a Markov random field (MRF) based fusion method, discriminative probabilistic graphical fusion (DPGF), which uses the associated tags to improve classification performance. Building on DPGF, we then propose a three-step learning procedure, DPGF+RLS+SVM, for the semi-supervised task, which trains on both labeled and unlabeled examples. Experimental results on two datasets, PASCAL VOC’07 and MIR Flickr, show that our methods effectively exploit the multi-modal data and the unlabeled examples, and that they outperform previous state-of-the-art methods on both classification tasks. Finally, we consider the weakly supervised setting, in which class labels are derived from noisy image tags; our semi-supervised approach also improves classification performance in this case.
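
The abstract does not spell out the three steps, so the following is only a loose illustration of how unlabeled multi-modal examples can contribute to training; it is not the authors' DPGF+RLS+SVM procedure. It assumes that RLS stands for regularized least squares, replaces the MRF-based fusion with a plain linear scorer on concatenated visual and tag features, and leaves out the final SVM stage. All function and variable names (`fit_rls`, `transfer_to_visual`, `V_lab`, `T_lab`, and so on) are hypothetical.

```python
import numpy as np


def fit_rls(X, y, lam=1.0):
    """Closed-form regularized least squares: solve (X^T X + lam*I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def transfer_to_visual(V_lab, T_lab, y_lab, V_unl, T_unl, lam=1.0):
    """Illustrative score transfer from a multi-modal scorer to a visual-only one.

    V_* are visual feature matrices, T_* are tag feature matrices,
    y_lab holds +1/-1 labels for the labeled examples.
    """
    # Stage A (stand-in for a fusion model, NOT the paper's MRF):
    # fit a linear scorer on concatenated visual + tag features of labeled data.
    F_lab = np.hstack([V_lab, T_lab])
    w_fused = fit_rls(F_lab, y_lab, lam)

    # Score every training example, labeled and unlabeled, with both modalities.
    F_all = np.vstack([F_lab, np.hstack([V_unl, T_unl])])
    scores = F_all @ w_fused

    # Stage B: regress those scores from visual features alone, so that the
    # unlabeled examples also shape the visual-only model.
    V_all = np.vstack([V_lab, V_unl])
    w_visual = fit_rls(V_all, scores, lam)
    return w_visual, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # synthetic data, for illustration only
    V_lab, T_lab = rng.normal(size=(6, 3)), rng.normal(size=(6, 2))
    V_unl, T_unl = rng.normal(size=(4, 3)), rng.normal(size=(4, 2))
    y_lab = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
    w_visual, scores = transfer_to_visual(V_lab, T_lab, y_lab, V_unl, T_unl)
    print(w_visual.shape, scores.shape)
```

The point of the second stage in this sketch is that tags are used only during training: the resulting `w_visual` scores a new image from its visual features alone.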

Author information

Authors and Affiliations

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
Liang Xie, Peng Pan & Yansheng Lu

Corresponding author

Correspondence to Peng Pan.

About this article

Cite this article

Xie, L., Pan, P. & Lu, Y. Markov random field based fusion for supervised and semi-supervised multi-modal image classification. Multimed Tools Appl 74, 613–634 (2015). https://doi.org/10.1007/s11042-014-2018-y

Received: 24 October 2013
Revised: 08 April 2014
Accepted: 14 April 2014
Published: 28 May 2014
Issue Date: January 2015
DOI: https://doi.org/10.1007/s11042-014-2018-y
