[2204.13867] Vision-Language Pre-Training for Boosting Scene Text Detectors