Abstract
Jobs running on parallel systems that use gang scheduling for multiprogramming may interact with each other in various ways. These interactions are affected by system parameters such as the multiprogramming level and the scheduling time quantum, so a careful evaluation is required to find parameter values that lead to optimal performance. We perform a detailed performance evaluation of three factors that affect scheduling systems running dynamic workloads (multiprogramming level, time quantum, and the use of backfilling for queue management) and of how their effects depend on offered load. Our evaluation is based on synthetic MPI applications running on a real cluster that actually implements the various scheduling schemes. The results demonstrate the importance of both components of the combination of gang scheduling and backfilling: gang scheduling reduces response time and slowdown, and backfilling allows it to do so with a limited multiprogramming level. Performance improves further when flexible coscheduling is used rather than strict gang scheduling, as this relaxes the constraints and allows for a denser packing.
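The two metrics named in the abstract, response time and slowdown, can be illustrated with a minimal sketch. It assumes the conventional definitions (response time as completion time minus arrival time, slowdown as response time divided by the run time on a dedicated machine); the abstract does not spell out its exact formulas. The `JobRecord` type, its field names, and the 10-second threshold in the bounded variant are hypothetical choices made only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """One completed job; all field names are illustrative assumptions."""
    arrival: float            # submission time, in seconds
    finish: float             # completion time, in seconds
    dedicated_runtime: float  # run time on a dedicated machine, in seconds


def response_time(job: JobRecord) -> float:
    """Time from submission to completion (queueing plus execution)."""
    return job.finish - job.arrival


def slowdown(job: JobRecord) -> float:
    """Response time normalized by the dedicated run time."""
    return response_time(job) / job.dedicated_runtime


def bounded_slowdown(job: JobRecord, threshold: float = 10.0) -> float:
    """Slowdown with very short jobs clamped to `threshold` seconds.

    The threshold value is an assumption; it keeps near-zero run times
    from dominating an average.
    """
    return max(1.0, response_time(job) / max(job.dedicated_runtime, threshold))


def mean(values: list[float]) -> float:
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


if __name__ == "__main__":
    jobs = [
        JobRecord(arrival=0.0, finish=120.0, dedicated_runtime=60.0),
        JobRecord(arrival=10.0, finish=40.0, dedicated_runtime=1.0),
    ]
    print("mean response time:", mean([response_time(j) for j in jobs]))
    print("mean bounded slowdown:", mean([bounded_slowdown(j) for j in jobs]))
```

Normalizing by the dedicated run time makes short and long jobs comparable, which is why slowdown is usually reported alongside raw response time.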
This work was supported by the U.S. Department of Energy through Los Alamos National Laboratory contract W-7405-ENG-36, and by the Israel Science Foundation (grant no. 219/99).
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Frachtenberg, E., Feitelson, D.G., Fernandez, J., Petrini, F. (2003). Parallel Job Scheduling under Dynamic Workloads. In: Feitelson, D., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2003. Lecture Notes in Computer Science, vol 2862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10968987_11
DOI: https://doi.org/10.1007/10968987_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20405-3
Online ISBN: 978-3-540-39727-4
eBook Packages: Springer Book Archive