[2210.09409] Sufficient Exploration for Convex Q-learning