[2410.21220] Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines