Multimedia Feature Mapping and Correlation Learning for Cross-Modal Retrieval | IGI Global Scientific Publishing

Xu Yuan, Hua Zhong, Zhikui Chen, Fangming Zhong, Yueming Hu
Copyright: © 2018 | Volume: 10 | Issue: 3 | Pages: 17
ISSN: 1938-0259 | EISSN: 1938-0267 | EISBN13: 9781522543381 | DOI: 10.4018/IJGHPC.2018070103
Cite Article

MLA

Yuan, Xu, et al. "Multimedia Feature Mapping and Correlation Learning for Cross-Modal Retrieval." IJGHPC, vol. 10, no. 3, 2018, pp. 29-45. https://doi.org/10.4018/IJGHPC.2018070103

APA

Yuan, X., Zhong, H., Chen, Z., Zhong, F., & Hu, Y. (2018). Multimedia Feature Mapping and Correlation Learning for Cross-Modal Retrieval. International Journal of Grid and High Performance Computing (IJGHPC), 10(3), 29-45. https://doi.org/10.4018/IJGHPC.2018070103

Chicago

Yuan, Xu, et al. "Multimedia Feature Mapping and Correlation Learning for Cross-Modal Retrieval," International Journal of Grid and High Performance Computing (IJGHPC) 10, no. 3 (2018): 29-45. https://doi.org/10.4018/IJGHPC.2018070103

Abstract

With the rapid growth of multimedia content on the Internet, the need for effective cross-modal retrieval has attracted much attention. Many related works focus only on the semantic mapping of modalities in a linear space and use low-level hand-crafted features as modality representations; they therefore ignore the latent semantic correlations between modalities in non-linear space and the extraction of high-level modality features. To address these issues, the authors first use convolutional neural networks and a topic model to obtain high-level semantic features for each modality. They then propose a supervised learning algorithm based on kernel partial least squares that can capture semantic correlations across modalities. Finally, the joint model of the different modalities is learned from the training set. Extensive experiments are conducted on three benchmark datasets: Wikipedia, Pascal, and MIRFlickr. The results show that the proposed approach achieves better retrieval performance than several state-of-the-art approaches.
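
The abstract names kernel partial least squares as the correlation-learning step but gives no algorithmic detail. The sketch below is therefore only an illustration of a generic NIPALS-style kernel PLS regression, not the authors' supervised algorithm. Everything in it is an assumption made for the example: the function names (`rbf_kernel`, `center_kernels`, `fit_kernel_pls`, `rank_by_cosine`), the choice of an RBF kernel, the cosine ranking, and the synthetic arrays that stand in for CNN image features and topic-model text features.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.1):
    """Gaussian (RBF) kernel between the rows of A and the rows of B (assumed kernel)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def center_kernels(K_train, K_query):
    """Center train and query kernels in feature space using training statistics."""
    n = K_train.shape[0]
    J = np.eye(n) - 1.0 / n                      # centering matrix I - (1/n) 11^T
    ones_q = np.full((K_query.shape[0], n), 1.0 / n)
    return J @ K_train @ J, (K_query - ones_q @ K_train) @ J


def fit_kernel_pls(Kc, Yc, n_components, n_iter=200, tol=1e-10):
    """NIPALS-style kernel PLS on a centered kernel Kc and centered targets Yc.

    Returns dual coefficients B (predictions are K @ B) and the score matrix T.
    """
    n = Kc.shape[0]
    K_res, Y_res = Kc.copy(), Yc.copy()
    T, U = np.zeros((n, n_components)), np.zeros((n, n_components))
    for a in range(n_components):
        u = Y_res[:, [0]] / np.linalg.norm(Y_res[:, 0])
        for _ in range(n_iter):
            t = K_res @ u
            t /= np.linalg.norm(t)
            u_new = Y_res @ (Y_res.T @ t)
            u_new /= np.linalg.norm(u_new)
            done = np.linalg.norm(u_new - u) < tol
            u = u_new
            if done:
                break
        t = K_res @ u                            # score vector matching the final u
        t /= np.linalg.norm(t)
        T[:, [a]], U[:, [a]] = t, u
        P = np.eye(n) - t @ t.T                  # deflate both blocks by t
        K_res, Y_res = P @ K_res @ P, P @ Y_res
    B = U @ np.linalg.solve(T.T @ Kc @ U, T.T @ Yc)
    return B, T


def rank_by_cosine(queries, gallery):
    """For each query row, return gallery indices sorted by decreasing cosine similarity."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-(q @ g.T), axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.normal(size=(60, 5))                # synthetic stand-in for CNN image features
    txt = np.tanh(img @ rng.normal(size=(5, 4)))  # synthetic stand-in for topic-model text features
    img_tr, img_te, txt_tr, txt_te = img[:50], img[50:], txt[:50], txt[50:]

    Kc, Kq = center_kernels(rbf_kernel(img_tr, img_tr), rbf_kernel(img_te, img_tr))
    y_mean = txt_tr.mean(axis=0)
    B, _ = fit_kernel_pls(Kc, txt_tr - y_mean, n_components=8)

    # Image-to-text retrieval: map query images into the text space, then rank the texts.
    ranking = rank_by_cosine(Kq @ B, txt_te - y_mean)
    print(ranking[:, 0])                          # top-ranked text index per query image
```

The kernel width `gamma` and the number of components are arbitrary here and would normally be tuned on held-out data. Text-to-image retrieval could be sketched the same way by swapping the roles of the two feature sets.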
