[2409.04792] Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn