[2311.08914] Efficiently Escaping Saddle Points for Non-Convex Policy Optimization