[2206.05836] GLIPv2: Unifying Localization and Vision-Language Understanding