[2306.06622] Weakly Supervised Visual Question Answer Generation