[2404.09526] LoongServe: Efficiently Serving Long-Context Large Language Models with Elastic Sequence Parallelism