[2204.03162] Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality