Abstract
We consider multi-class multi-server queuing systems where a subset of servers, called a server pool, may collaborate in serving jobs of a given class. The pools of servers associated with different classes may overlap, so the sharing of server resources across classes is done via a dynamic allocation policy based on a fairness criterion. We consider an asymptotic regime where the total load increases proportionally with the system size. We show that under limited scaling in size of server pools the stationary distribution for activity of a fixed finite subset of servers has asymptotically a product form, which in turn implies a concentration result for server activity. In particular, we establish a clear connection between the scaling of server pools’ size and asymptotic independence. Further, these results are robust to the service requirement distribution of jobs. For large-scale cloud systems where heterogeneous pools of servers collaborate in serving jobs of diverse classes, a concentration in server activity indicates that the overall power and network capacity that need to be provisioned may be substantially lower than the worst case, thus reducing costs.
Similar content being viewed by others
Notes
Our model may be generalized in the following ways without affecting our results: (1) The service requirement distribution may be different for each class as long as the mean service requirement is same for each class. (2) The arrival rate and mean service requirement may be different for each class as long as their product (which equals \(\rho _i^{(m)}\)) is same for each class.
References
Aldous, D.J.: Exchangeability and Related Topics. Springer, Berlin (1985)
Barroso, L., Hölzle, U.: The datacenter as a computer: an introduction to the design of warehouse-scale machines. Synth. Lect. Comput. Archit. 4(1), 1–108 (2009)
Bonald, T., Massoulié, L., Proutière, A., Virtamo, J.: A queueing analysis of max-min fairness, proportional fairness and balanced fairness. Queueing Syst. 53, 65–84 (2006)
Bonald, T., Proutière, A.: Insensitive bandwidth sharing in data networks. Queueing Syst. 44, 69–100 (2003)
Bonald, T., Proutière, A.: On performance bounds for the integration of elastic and adaptive streaming flows. In: Proceedings of ACM Sigmetrics, pp. 235–245 (2004)
Bonald, T., Proutière, A., Roberts, J., Virtamo, J.: Computational aspects of balanced fairness. In: Proceedings of ITC (2003)
Bramson, M., Lu, Y., Prabhakar, B.: Asymptotic independence of queues under randomized load balancing. Queueing Syst. 71(3), 247–292 (2012)
Edmonds, J.: Submodular functions, matroids, and certain polyhedra. In: Proceedings of Calgary International Conference on Combinatorial Structures and Applications, pp. 69–87 (1969)
Kallenberg, O.: Canonical representations and convergence criteria for processes with interchangeable increments. Zeitschrift fr Wahrscheinlichkeitstheorie und Verwandte Gebiete 27(1), 23–36 (1973)
Kelly, F.P.: Reversibility and Stochastic Networks. Wiley, New York (1979)
Kelly, F.P.: Loss networks. Ann. Appl. Probab. 1(3), 319–378 (1991)
Lin, M., Wierman, A., Andrew, L., Thereska, E.: Dynamic right-sizing for power-proportional data centers. In: Proceedings of IEEE Infocom, pp. 1098–1106 (2011)
Massoulié, L.: Structural properties of proportional fairness: stability and insensitivity. Ann. Appl. Probab. 17(3), 809–839 (2007)
Mitzenmacher, M.D.: The power of two choices in randomized load balancing. Ph.D. thesis, University of California, Berkeley (1996)
Nemhauser, G.L., Wolsey, L.A.: Integer and Combinatorial Optimization, vol. 18. Wiley, New York (1988)
Shah, V., de Veciana, G.: Performance evaluation and asymptotics for content delivery networks. In: IEEE Infocom, pp. 2607–2615 (2014)
Shah, V., de Veciana, G.: Impact of fairness and heterogeneity on delays in large-scale content delivery networks. In: ACM Sigmetrics (2015)
Vvedenskaya, N.D., Dobrushin, R.L., Karpelevich, F.I.: Queueing system with selection of the shortest of two queues: an asymptotic approach. Probl. Peredachi Informatsii 32(1), 20–34 (1996)
Whitt, W.: Blocking when service is required from several facilities simultaneously. AT&T Tech. J. 64(8), 1807–1856 (1985)
Acknowledgments
The authors would like to thank François Baccelli and Sanjay Shakkottai at The University of Texas at Austin, and Xiaoqing Zhu at Cisco Systems for helpful discussions which motivated this work. Virag Shah would also like to thank Anurag Kumar at Indian Institute of Science for providing him with the first exposure to mean field models.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Shah, V., de Veciana, G. Asymptotic independence of servers’ activity in queueing systems with limited resource pooling. Queueing Syst 83, 13–28 (2016). https://doi.org/10.1007/s11134-016-9475-0
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11134-016-9475-0