Abstract
Multimedia recommendation aims to predict whether users will interact with multimodal items. A few recent works that explicitly learn the semantic structure between items from multimodal features show impressive performance gains. These gains are mainly attributed to the capability of graph convolutional networks (GCNs) to learn superior item representations by propagating and aggregating information from high-order neighbors on the semantic structure. However, these methods still face two major challenges: (a) noisy relations (edges) in the item-item semantic structure disrupt information propagation and produce low-quality item representations, which impairs the effectiveness and robustness of existing methods; (b) the lack of an optimization objective that exploits informative samples and global preference information leads to suboptimal training, which makes users and items indistinguishable in the embedding space. To overcome these challenges, we propose Enhancing Multimedia Recommendation through Item-Item Semantic Denoising and Global Preference Awareness (MMGPA). Specifically, the model contains two components: (1) a modal semantic representation network that learns high-quality multimodal item representations by modeling the denoised item-item semantic structure, and (2) a global preference-aware optimization objective that prioritizes the most informative hard sample pairs while constraining multiple preference distances to better separate the embedding space. Extensive experimental results demonstrate that the proposed method outperforms various state-of-the-art competitors on three public benchmark datasets.
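As a rough illustration of what pruning noisy item-item edges and then propagating over the remaining structure can look like, the sketch below builds a cosine-similarity graph from item modality features, keeps only each item's top-k neighbors, and runs a few parameter-free propagation steps. This is an assumption made for illustration, not the denoising procedure or network of MMGPA, which the abstract does not specify; the function names `knn_semantic_graph` and `propagate` and the parameters `k` and `num_layers` are hypothetical.

```python
import numpy as np


def knn_semantic_graph(features: np.ndarray, k: int) -> np.ndarray:
    """Row-normalized item-item graph keeping each item's top-k cosine neighbors.

    Illustrative assumption only: top-k sparsification is one common way to
    drop weak (potentially noisy) edges; it is not taken from the paper.
    """
    # L2-normalize rows so the dot product equals cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    # Exclude self-loops from the neighbor search.
    np.fill_diagonal(sim, -np.inf)
    # Column indices of the k most similar items for every row (requires k < n).
    topk = np.argpartition(-sim, k - 1, axis=1)[:, :k]
    rows = np.arange(sim.shape[0])[:, None]
    adj = np.zeros_like(sim)
    # Keep only the selected edges; negative similarities are clipped to zero.
    adj[rows, topk] = np.clip(sim[rows, topk], 0.0, None)
    # Row-normalize so each propagation step is a weighted average.
    degree = adj.sum(axis=1, keepdims=True)
    return adj / np.clip(degree, 1e-12, None)


def propagate(adj: np.ndarray, embeddings: np.ndarray, num_layers: int) -> np.ndarray:
    """Aggregate neighbor embeddings num_layers times (no learned weights)."""
    out = embeddings
    for _ in range(num_layers):
        out = adj @ out
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    item_features = rng.random((6, 4))   # toy modality features for 6 items
    graph = knn_semantic_graph(item_features, k=2)
    item_repr = propagate(graph, item_features, num_layers=2)
    print(graph.shape, item_repr.shape)
```

Stacking more propagation steps lets an item draw on neighbors several hops away, which is the high-order aggregation the abstract refers to; the value of `k` controls how aggressively weak edges are discarded.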
Acknowledgments
This work is supported by the National Natural Science Foundation of China (No. 62272332) and the Major Program of the Natural Science Foundation of Jiangsu Higher Education Institutions of China (No. 22KJA520006).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, Y., Zheng, S., Zhou, Q., Chen, W., Zhao, L. (2023). Enhancing Multimedia Recommendation Through Item-Item Semantic Denoising and Global Preference Awareness. In: Yang, X., et al. Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science, vol 14176. Springer, Cham. https://doi.org/10.1007/978-3-031-46661-8_53
DOI: https://doi.org/10.1007/978-3-031-46661-8_53
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46660-1
Online ISBN: 978-3-031-46661-8
eBook Packages: Computer Science, Computer Science (R0)