[2305.05208] Boosting Visual-Language Models by Exploiting Hard Samples