[2311.02849] Co-training and Co-distillation for Quality Improvement and Compression of Language Models