[2211.03256] On Web-based Visual Corpus Construction for Visual Document Understanding