On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\epsilon$-Greedy Exploration

Zhang, Shuai; Li, Hongkang; Wang, Meng; Liu, Miao; Chen, Pin-Yu; Lu, Songtao; Liu, Sijia; Murugesan, Keerthiram; Chaudhury, Subhajit

arXiv:2310.16173 (cs)

[Submitted on 24 Oct 2023]

Abstract: This paper provides a theoretical understanding of the Deep Q-Network (DQN) with $\epsilon$-greedy exploration in deep reinforcement learning. Despite the tremendous empirical success of the DQN, its theoretical characterization remains underexplored. First, the exploration strategy is either impractical or ignored in existing analyses. Second, in contrast to conventional Q-learning algorithms, the DQN employs a target network and experience replay to acquire an unbiased estimate of the mean-square Bellman error (MSBE) used in training the Q-network. However, existing theoretical analyses of DQNs either lack a convergence analysis or bypass the technical challenges by deploying a significantly overparameterized neural network, which is not computationally efficient. This paper provides the first theoretical convergence and sample complexity analysis of the practical setting of DQNs with an $\epsilon$-greedy policy. We prove that an iterative procedure with decaying $\epsilon$ converges geometrically to the optimal Q-value function. Moreover, larger values of $\epsilon$ enlarge the region of convergence but slow down convergence, while the opposite holds for smaller values of $\epsilon$. Experiments support our theoretical insights on DQNs.
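
For readers unfamiliar with the exploration rule named in the abstract, the following is a minimal Python sketch of $\epsilon$-greedy action selection with a decaying $\epsilon$. It is an illustration only, not the procedure analyzed in the paper: the function names `epsilon_greedy` and `decayed_epsilon`, the geometric decay schedule, and the default values `eps_start`, `eps_min`, and `decay` are all assumptions made here, since the abstract does not specify them.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a uniformly random action with probability epsilon, else the greedy one.

    q_values: list of estimated Q-values, one per action (hypothetical input format).
    """
    if rng.random() < epsilon:
        # Explore: any action, chosen uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: index of the largest estimated Q-value (ties go to the lowest index).
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decayed_epsilon(step, eps_start=1.0, eps_min=0.05, decay=0.99):
    """Hypothetical geometric decay schedule for epsilon, floored at eps_min."""
    return max(eps_min, eps_start * decay ** step)


# Usage: epsilon shrinks over training steps, so action selection
# gradually shifts from exploration to exploitation.
q_estimates = [0.1, 0.9, 0.3]
for step in (0, 100, 1000):
    eps = decayed_epsilon(step)
    action = epsilon_greedy(q_estimates, eps)
```

Under this sketch, a larger `epsilon` means more uniformly random actions and a smaller one means more greedy actions, which is the trade-off the abstract describes between the region of convergence and the convergence speed.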

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2310.16173 [cs.LG]
	(or arXiv:2310.16173v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2310.16173
Journal reference:	NeurIPS 2023

Submission history

From: Hongkang Li
[v1] Tue, 24 Oct 2023 20:37:02 UTC (820 KB)
