[2210.05668] Understanding Embodied Reference with Touch-Line Transformer