[2306.00245] From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces