[2307.02460] Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources