[2405.05417] Fishing for Magikarp: Automatically Detecting Under-trained Tokens in Large Language Models