Alleviating Matthew Effect of Offline Reinforcement Learning in Interactive Recommendation

Gao, Chongming; Huang, Kexin; Chen, Jiawei; Zhang, Yuan; Li, Biao; Jiang, Peng; Wang, Shiqi; Zhang, Zhong; He, Xiangnan

doi:10.1145/3539618.3591636

arXiv:2307.04571 (cs)

[Submitted on 10 Jul 2023]

Abstract: Offline reinforcement learning (RL), a technique that learns a policy from logged data without interacting with an online environment, has become a favorable choice for decision-making processes such as interactive recommendation. Offline RL faces the value overestimation problem. To address it, existing methods employ conservatism, e.g., by constraining the learned policy to be close to the behavior policy or by penalizing rarely visited state-action pairs. However, applying such offline RL to recommendation causes a severe Matthew effect, i.e., the rich get richer and the poor get poorer, because it promotes popular items or categories while suppressing less popular ones. This is a notorious issue that needs to be addressed in practical recommender systems.
In this paper, we aim to alleviate the Matthew effect in offline RL-based recommendation. Through theoretical analyses, we find that the conservatism of existing methods fails to pursue users' long-term satisfaction. This inspires us to add a penalty term that relaxes the pessimism on states where the logging policy has high entropy and indirectly penalizes actions leading to less diverse states. This leads to the main technical contribution of the work: the Debiased model-based Offline RL (DORL) method. Experiments show that DORL not only captures user interests well but also alleviates the Matthew effect. The implementation is available via this https URL.
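The entropy-based idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact form of the DORL penalty, so everything below is an assumption made for illustration: the additive reward shaping, the function names `logging_entropy` and `shaped_reward`, the coefficients `lam_u` and `lam_e`, and the toy logs. It only shows how an entropy term computed from logged actions could offset a pessimism penalty in states where the logging policy was diverse.

```python
import math
from collections import Counter


def logging_entropy(logged_actions):
    """Shannon entropy (in nats) of the empirical action distribution.

    `logged_actions` is a hypothetical list of item categories that the
    logging policy recommended in one state (or group of similar states).
    """
    counts = Counter(logged_actions)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def shaped_reward(r_hat, uncertainty, entropy, lam_u=0.1, lam_e=0.1):
    """Assumed additive shaping, not taken from the abstract.

    The uncertainty term is the usual conservative (pessimistic) penalty;
    the entropy term relaxes it where the logging policy was diverse.
    """
    return r_hat - lam_u * uncertainty + lam_e * entropy


# Hypothetical logs: one state dominated by a single category,
# another where five categories were recommended equally often.
narrow = ["news"] * 9 + ["music"]
diverse = ["news", "music", "sports", "film", "games"] * 2

h_narrow = logging_entropy(narrow)
h_diverse = logging_entropy(diverse)

# With equal predicted reward and uncertainty, the state with the more
# diverse logging behaviour ends up with the higher shaped reward.
print(shaped_reward(1.0, 0.5, h_narrow) < shaped_reward(1.0, 0.5, h_diverse))
```

In this toy setting, an action that leads to the narrow state receives a smaller shaped reward than one that leads to the diverse state, which is one way to read "indirectly penalizes actions leading to less diverse states".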

Comments:	SIGIR 2023 Full Paper
Subjects:	Information Retrieval (cs.IR)
Cite as:	arXiv:2307.04571 [cs.IR]
	(or arXiv:2307.04571v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2307.04571
Related DOI:	https://doi.org/10.1145/3539618.3591636

Submission history

From: Chongming Gao
[v1] Mon, 10 Jul 2023 14:03:34 UTC (603 KB)
