[2403.19546] Croissant: A Metadata Format for ML-Ready Datasets