[2310.02110] Sieve: Multimodal Dataset Pruning Using Image Captioning Models