[2304.10892] Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems