Abstract
In Sequential Recommenders (SR), encoding and utilizing modalities in an end-to-end manner is costly because of large modality encoder sizes. Two-stage approaches mitigate this cost, but they suffer from poor performance due to modality forgetting, where the sequential objective overshadows modality representation. We propose a lightweight knowledge distillation solution that preserves both merits: retaining modality information and maintaining high efficiency. Specifically, we introduce a novel method that enhances the learning of embeddings in SR through the supervision of modality correlations. The supervision signals are distilled from the original modality representations, including both (1) holistic correlations, which quantify their overall associations, and (2) dissected correlation types, which refine their relationship facets (honing in on specific aspects such as color or shape consistency). To further address modality forgetting, we propose an asynchronous learning step, allowing the original information to be retained longer for training the representation learning module. Our approach is compatible with various backbone architectures and outperforms the top baselines by 6.8% on average. We empirically demonstrate that preserving the original feature associations from modality encoders significantly boosts task-specific recommendation adaptation. Additionally, we find that larger modality encoders (e.g., Large Language Models) contain richer feature sets that necessitate more fine-grained modeling to reach their full performance potential.
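The holistic correlation supervision described above can be illustrated with a minimal sketch: a frozen modality encoder's item embeddings serve as the teacher, and the trainable SR item embeddings serve as the student; the distillation loss matches the student's item-item Pearson correlation matrix to the teacher's. This is an assumption-laden illustration, not the paper's actual implementation; the function names (`pairwise_corr`, `correlation_distillation_loss`) are hypothetical.

```python
import numpy as np

def pairwise_corr(emb):
    """Item-item Pearson correlation matrix for a (num_items, dim) array."""
    # Center each embedding, normalize to unit length, then take dot products,
    # which yields pairwise Pearson correlation coefficients.
    centered = emb - emb.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    unit = centered / np.clip(norms, 1e-12, None)
    return unit @ unit.T

def correlation_distillation_loss(teacher_emb, student_emb):
    """MSE between teacher (frozen modality) and student (SR) correlations.

    The teacher matrix is the distilled supervision signal: the student's
    embedding space is pushed to preserve the original modality associations.
    """
    t_corr = pairwise_corr(teacher_emb)   # fixed target
    s_corr = pairwise_corr(student_emb)   # depends on trainable embeddings
    return float(np.mean((t_corr - s_corr) ** 2))
```

In practice such a term would be added to the sequential recommendation objective with a weighting coefficient, so the embeddings are shaped jointly by next-item prediction and by the preserved modality correlations.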
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Hu, H., Liu, Q., Li, C., Kan, MY. (2024). Lightweight Modality Adaptation to Sequential Recommendation via Correlation Supervision. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14608. Springer, Cham. https://doi.org/10.1007/978-3-031-56027-9_8
DOI: https://doi.org/10.1007/978-3-031-56027-9_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56026-2
Online ISBN: 978-3-031-56027-9