[2212.03363] Few-Shot Preference Learning for Human-in-the-Loop RL