[2308.11940] Audio Generation with Multiple Conditional Diffusion Model