[2312.12423] Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model