[2402.09360] HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference