[2211.06774] Large-Scale Bidirectional Training for Zero-Shot Image Captioning