Abstract
In this paper we describe the design, implementation and experimental evaluation of a technique for operating system schedulers called processor pool-based scheduling [51]. Our technique is designed to assign processes (or kernel threads) of parallel applications to processors in multiprogrammed, shared-memory NUMA multiprocessors. The results of the experiments conducted in this research demonstrate that: 1) Pool-based scheduling is an effective method for localizing application execution and reducing mean response times. 2) Although application parallelism should be considered, the optimal pool size is a function of the the system architecture. 3) The strategies of placing new applications in a pool with the largest potential for inpool growth (i.e., the pool containing the fewest jobs) and of isolating applications from each other are desirable properties of algorithms for operating system schedulers executing on NUMA architectures. The “Worst-Fit” policy we examine incorporates both of these properties.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
A. Agarwal, D. Chaiken, K. Johnson, D. Kranz, J. Kubiatowicz, K. Kurihara, B. Lim, G. Maa, and D. Nussbaum, “The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor”, Scalable Shared Memory Multiprocessors, ed. M. Dubois and S. S. Thakkar, Kluwer Academic Publishers, Norwell, Massachusetts, pp. 239–261, 1991.
I. Ahmad and A. Ghafoor, “Semi-Distributed Load Balancing for Massively Parallel Multicomputer Systems”, IEEE Transactions on Software Engineering, Vol. 17, No. 10, pp. 987–1004, October, 1991.
F. Bellosa, “Locality-Information-Based Scheduling in Shared-Memory Multiprocessors”, Job Scheduling Strategies for Parallel Processing, ed. D. G. Feitelson and L. Rudolph, Vol. 1162, Springer-Verlag, Lecture Notes in Computer Science, pp. 271–289, April, 1996.
D. L. Black, “Scheduling Support for Concurrency and Parallelism in the Mach Operating System”, IEEE Computer, pp. 35–43, May, 1990.
W. Bolosky, M. Scott, R. Fitzgerald, and A. Cox, “NUMA Policies and their Relationship to Memory Architecture”, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 212–221, April, 1991.
T. Brecht, “On the Importance of Parallel Application Placement in NUMA Multiprocessors”, Proceedings of the Fourth Symposium on Experiences with Distributed and Multiprocessor Systems (SEDMS IV), San Diego, CA, pp. 1–18, September, 1993.
T. Brecht, “Multiprogrammed Parallel Application Scheduling in NUMA Multiprocessors”, Ph.D. Thesis, University of Toronto, Toronto, Ontario, Technical Report CSRI-303, June, 1994.
T. Brecht and K. Guha, “Using Parallel Program Characteristics in Dynamic Processor Allocation Policies”, Performance Evaluation, Vol. 27 & 28, pp. 519–539, October, 1996.
H. Burkhardt, S. Frank, B. Knobe, and J. Rothnie, “Overview of the KSR1 Computer System”, Kendall Square Research, Boston, Technical Report KSRTR-9202001, February, 1992.
R. Chandra, S. Devine, B. Verghese, A. Gupta, and M. Rosenblum, “Scheduling and Page Migration for Multiprocessor Compute Servers”, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, pp. 12–24, October, 1994.
J. Chapin, S. Herrod, M. Rosenblum, and A. Gupta, “Memory System Performance of UNIX on CC-NUMA Multiprocessors”, Proceedings of the 1995 ACM IGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, Ottawa, ON, May, 1995.
J. Chapin, M. Rosenblum, S. Devine, T. Lahiri, D. Teodosiu, and A. Gupta, “Hive: Fault Containment for Shared-Memory Multiprocessors”, Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles, pp. 12–25, December, 1995.
Convex, Convex; Exemplar SPP1000/1200 Architecture, Convex Press, 1995.
A. Cox and R. Fowler, “The Implementation of a Coherent Memory Abstraction on a NUMA Multiprocessor: Experiences with Platinum”, Proceedings of the Twelfth ACM Symposium on Operating Systems Principles, pp. 32–43, December, 1989.
D. Feitelson and L. Rudolph, “Evaluation of Design Choices for Gang Scheduling using Distributed Hierarchical Control”, Journal of Parallel and Distributed Computing, Vol. 35, No. 1, pp. 18–34, May, 1996.
D. G. Feitelson and L. Rudolph, “Mapping and Scheduling in a Shared Parallel Environment Using Distributed Hierarchical Control”, 1990 International Conference on Parallel Processing, pp. 11–18, 1990.
D. G. Feitelson and L. Rudolph, “Distributed Hierarchical Control for Parallel Processing”, IEEE Computer, pp. 65–77, May, 1990.
B. Gamsa, “Region-Oriented Main Memory Management in Shared-Memory NUMA Multiprocessors”, M.Sc. Thesis, University of Toronto, Toronto, Ontario, September, 1992.
A. Gupta, A. Tucker, and S. Urushibara, “The Impact of Operating System Scheduling Policies and Synchronization Methods on the Performance of Parallel Applications”, Proceedings of the 1991 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, San Diego, CA, pp. 120–132, May, 1991.
M. Holliday, “Reference History, Page Size, and Migration Daemons in Local/Remote Architectures”, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 104–112, April, 1989.
J. Kuskin, D. Ofelt, M. Heinrich, J. Heinlein, R. Simoni, K. Gharachorloo, J. Chapin, D. Nakahira, J. Baxter, M. Horowitz, A. Gupta, M. Rosenblum, and J. Hennessy, “The Stanford FLASH Multiprocessor”, Proceedings of the 21st Annual International Symposium on Computer Architecture, pp. 302–313, April, 1994.
R. LaRowe Jr., C. Ellis, and L. Kaplan, “The Robustness of NUMA Memory Management”, Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles, Pacific Grove, CA, pp. 137–151, October, 1991.
D. Lenoski, J. Laudon, T. Joe, D. Nakahari, L. Stevens, A. Gupta, and J. Hennessy, “The DASH Prototype: Implementation and Performance”, The Proceedings of the 19th Annual International Symposium on Computer Architecture, pp. 92–103, May, 1992.
S. T. Leutenegger and M. K. Vernon, “The Performance of Multiprogrammed Multiprocessor Scheduling Policies”, Proceedings of the 1990 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, Boulder, CO, pp. 226–236, May, 1990.
T. Lovett and R. Clapp, “STiNG: A CC-NUMA Computer System for the Commercial Marketplace”, Proceedings of the 23rd Annual International Symposium on Computer Architecture, pp. 308–317, May, 1996.
E. P. Markatos, “Scheduling for Locality in Shared-Memory Multiprocessors”, Ph.D. Thesis, Department of Computer Science, University of Rochester, Rochester, New York, May, 1993.
E. P Markatos and T. J. LeBlanc, “Load Balancing vs. Locality Management in Shared-Memory Multiprocessors”, 1992 International Conference on Parallel Processing, pp. 258–267, August, 1992.
E. P Markatos and T J. LeBlanc, “Using Processor Affinity in Loop Scheduling on Shared-Memory Multiprocessors”, Proceedings of Supercomputing '92, Minneapolis, MN, pp. 104–113, November, 1992.
C. McCann, R. Vaswani, and J. Zahorjan, “A Dynamic Processor Allocation Policy for Multiprogrammed, Shared Memory Multiprocessors” ACM Transactions on Computer Systems, Vol. 11, No. 2, pp. 146–178, May, 1993.
C. McCann and J. Zahorjan, “Scheduling Memory Constrained Jobs on Distributed Memory Parallel Computers”, Proceedings of the 1995 ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, Ottawa, ON, pp. 208–219, May, 1995.
A. Nowatzyk, G. Aybay, M. Browne, E. Kelly, M. Parkin, B. Radke, and S. Vishin, “The S3.mp Scalable Shared Memory Multiprocessor”, Proceedings of the International Conference on Parallel Processing, 1995.
J.K.Ousterhout,“Scheduling Techniques for Concurrent Systems”, Proceedings of the 3rd International Conference on Distributed Computing Systems, pp. 22–30, October, 1982.
E. Parsons and K. Sevcik, “Coordinated Allocation of Memory and Processors in Multiprocessors”, Proceedings of the 1996 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, Philadelphia, PA, pp. 57–67, May, 1996.
V. Peris, M. Squillante, and V. Naik, “Analysis of the Impact of Memory in Distributed Parallel Processing Systems”, Proceedings of the 1994 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, Nashville, TN, pp. 5–18, May, 1994.
J. Philbin, J. Edler, O. Anshus, C. Douglas, and K. Li, “Thread Scheduling for Cache Locality”, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, Cambridge, MA, pp. 60–71, October, 1996.
S. Setia,“The Interaction Between Memory Allocations and Adaptive Partitioning in Message-Passing Multiprocessors”, Job Scheduling Strategies for Parallel Processing, ed. D. G. Feitelson and L. Rudolph, Vol. 949, Springer-Verlag, Lecture Notes in Computer Science, pp. 146–164, April, 1995.
A. Silberschatz and P. Galvin, Operating System Concepts, Addison-Wesley, Reading, Massachusetts, 1994.
J. P. Singh, T. Joe, A. Gupta, and J. Hennessy, “An Empirical Comparison of the Kendall Square Research KSR-1 and Stanford Dash Multiprocessors”, Proceedings of Supercomputing '93, Portland, OR, pp. 214–225, November, 1993.
M. S. Squillante, “Issues in Shared-Memory Multiprocessor Scheduling: A Performance Evaluation”, Ph.D. Thesis, Department of Computer Science and Engineering, University of Washington, Seattle, Washington, Technical Report 90-10-04, October, 1990.
M. S. Squillante and E. D. Lazowska, “Using Processor Cache Affinity Information in Shared-Memory Multiprocessor Scheduling”, IEEE Transactions on Parallel and Distributed Systems, Vol. 4, No. 2, pp. 131–143, February, 1993.
M. Stumm, Z. Vranesic, R. White, R. Unrau, and K. Farkas, “Experiences with the Hector Multiprocessor”, Proceedings of the International Parallel Processing Symposium Parallel Processing Fair, pp. 9–16, April, 1993.
D. Thiebaut and H. S. Stone, “Footprints in the Cache”, ACM Transactions on Computer Systems, Vol. 5, No. 4, pp. 305–329, November, 1987.
A. Tucker and A. Gupta,“Process Control and Scheduling Issues for Multiprogrammed Shared-Memory Multiprocessors”, Proceedings of the Twelfth ACM Symposium on Operating Systems Principles, pp. 159–166, December, 1989.
R. Unrau, “Scalable Memory Management through Hierarchical Symmetric Multiprocessing”, Ph.D. Thesis, University of Toronto, Toronto, Ontario, January, 1993.
R. Unrau, M. Stumm, and O. Krieger, “Hierarchical Clustering: A Structure for Scalable Multiprocessor Operating System Design”, Proceedings of the USENIX Workshop on Micro-Kernels and Other Kernel Architectures, Seattle, WA, pp. 285–303, April, 1992.
R. Vaswani and J. Zahorjan, “The Implications of Cache Affinity on Processor Scheduling for Multiprogrammed, Shared Memory Multiprocessors”, Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles, Pacific Grove, CA, pp. 26–40, October, 1991.
B. Verghese, S. Devine, A. Gupta, and M. Rosenblum, “Operating System Support for Improving Data Locality on CC-NUMA Compute Servers”, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, Cambridge, MA, pp. 279–289, October, 1996.
Z. Vranesic, M. Stumm, D. Lewis, and R. White, “Hector: A Hierarchically Structured Shared-Memory Multiprocessor”, IEEE Computer, Vol. 24, No. 1, pp. 72–79, January, 1991.
S. Woo, M. Ohara, E. Torrie, J.P. Singh, and A. Gupta, “The SPLASH-2 Programs: Characterization and Methodological Considerations”, Proceedings of the 22nd Annual International Symposium on Computer Architecture, pp. 24–36, 1995.
J. Zahorjan and C. McCann, “Processor Scheduling in Shared Memory Multiprocessors”, Proceedings of the 1990 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, Boulder, CO, pp. 214–225, May, 1990.
S. Zhou and T. Brecht, “Processor Pool-Based Scheduling for Large-Scale NUMA Multiprocessors”, Proceedings of the 1991 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, San Diego, CA, pp. 133–142, May, 1991.
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brecht, T.B. (1997). An experimental evaluation of processor pool-based scheduling for shared-memory NUMA multiprocessors. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1997. Lecture Notes in Computer Science, vol 1291. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63574-2_20
Download citation
DOI: https://doi.org/10.1007/3-540-63574-2_20
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63574-1
Online ISBN: 978-3-540-69599-8
eBook Packages: Springer Book Archive