An experimental evaluation of processor pool-based scheduling for shared-memory NUMA multiprocessors | SpringerLink
Skip to main content

An experimental evaluation of processor pool-based scheduling for shared-memory NUMA multiprocessors

  • Conference paper
  • First Online:
Job Scheduling Strategies for Parallel Processing (JSSPP 1997)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1291))

Included in the following conference series:

Abstract

In this paper we describe the design, implementation and experimental evaluation of a technique for operating system schedulers called processor pool-based scheduling [51]. Our technique is designed to assign processes (or kernel threads) of parallel applications to processors in multiprogrammed, shared-memory NUMA multiprocessors. The results of the experiments conducted in this research demonstrate that: 1) Pool-based scheduling is an effective method for localizing application execution and reducing mean response times. 2) Although application parallelism should be considered, the optimal pool size is a function of the the system architecture. 3) The strategies of placing new applications in a pool with the largest potential for inpool growth (i.e., the pool containing the fewest jobs) and of isolating applications from each other are desirable properties of algorithms for operating system schedulers executing on NUMA architectures. The “Worst-Fit” policy we examine incorporates both of these properties.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. A. Agarwal, D. Chaiken, K. Johnson, D. Kranz, J. Kubiatowicz, K. Kurihara, B. Lim, G. Maa, and D. Nussbaum, “The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor”, Scalable Shared Memory Multiprocessors, ed. M. Dubois and S. S. Thakkar, Kluwer Academic Publishers, Norwell, Massachusetts, pp. 239–261, 1991.

    Google Scholar 

  2. I. Ahmad and A. Ghafoor, “Semi-Distributed Load Balancing for Massively Parallel Multicomputer Systems”, IEEE Transactions on Software Engineering, Vol. 17, No. 10, pp. 987–1004, October, 1991.

    Article  Google Scholar 

  3. F. Bellosa, “Locality-Information-Based Scheduling in Shared-Memory Multiprocessors”, Job Scheduling Strategies for Parallel Processing, ed. D. G. Feitelson and L. Rudolph, Vol. 1162, Springer-Verlag, Lecture Notes in Computer Science, pp. 271–289, April, 1996.

    Google Scholar 

  4. D. L. Black, “Scheduling Support for Concurrency and Parallelism in the Mach Operating System”, IEEE Computer, pp. 35–43, May, 1990.

    Google Scholar 

  5. W. Bolosky, M. Scott, R. Fitzgerald, and A. Cox, “NUMA Policies and their Relationship to Memory Architecture”, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 212–221, April, 1991.

    Google Scholar 

  6. T. Brecht, “On the Importance of Parallel Application Placement in NUMA Multiprocessors”, Proceedings of the Fourth Symposium on Experiences with Distributed and Multiprocessor Systems (SEDMS IV), San Diego, CA, pp. 1–18, September, 1993.

    Google Scholar 

  7. T. Brecht, “Multiprogrammed Parallel Application Scheduling in NUMA Multiprocessors”, Ph.D. Thesis, University of Toronto, Toronto, Ontario, Technical Report CSRI-303, June, 1994.

    Google Scholar 

  8. T. Brecht and K. Guha, “Using Parallel Program Characteristics in Dynamic Processor Allocation Policies”, Performance Evaluation, Vol. 27 & 28, pp. 519–539, October, 1996.

    Google Scholar 

  9. H. Burkhardt, S. Frank, B. Knobe, and J. Rothnie, “Overview of the KSR1 Computer System”, Kendall Square Research, Boston, Technical Report KSRTR-9202001, February, 1992.

    Google Scholar 

  10. R. Chandra, S. Devine, B. Verghese, A. Gupta, and M. Rosenblum, “Scheduling and Page Migration for Multiprocessor Compute Servers”, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, pp. 12–24, October, 1994.

    Google Scholar 

  11. J. Chapin, S. Herrod, M. Rosenblum, and A. Gupta, “Memory System Performance of UNIX on CC-NUMA Multiprocessors”, Proceedings of the 1995 ACM IGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, Ottawa, ON, May, 1995.

    Google Scholar 

  12. J. Chapin, M. Rosenblum, S. Devine, T. Lahiri, D. Teodosiu, and A. Gupta, “Hive: Fault Containment for Shared-Memory Multiprocessors”, Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles, pp. 12–25, December, 1995.

    Google Scholar 

  13. Convex, Convex; Exemplar SPP1000/1200 Architecture, Convex Press, 1995.

    Google Scholar 

  14. A. Cox and R. Fowler, “The Implementation of a Coherent Memory Abstraction on a NUMA Multiprocessor: Experiences with Platinum”, Proceedings of the Twelfth ACM Symposium on Operating Systems Principles, pp. 32–43, December, 1989.

    Google Scholar 

  15. D. Feitelson and L. Rudolph, “Evaluation of Design Choices for Gang Scheduling using Distributed Hierarchical Control”, Journal of Parallel and Distributed Computing, Vol. 35, No. 1, pp. 18–34, May, 1996.

    Article  Google Scholar 

  16. D. G. Feitelson and L. Rudolph, “Mapping and Scheduling in a Shared Parallel Environment Using Distributed Hierarchical Control”, 1990 International Conference on Parallel Processing, pp. 11–18, 1990.

    Google Scholar 

  17. D. G. Feitelson and L. Rudolph, “Distributed Hierarchical Control for Parallel Processing”, IEEE Computer, pp. 65–77, May, 1990.

    Google Scholar 

  18. B. Gamsa, “Region-Oriented Main Memory Management in Shared-Memory NUMA Multiprocessors”, M.Sc. Thesis, University of Toronto, Toronto, Ontario, September, 1992.

    Google Scholar 

  19. A. Gupta, A. Tucker, and S. Urushibara, “The Impact of Operating System Scheduling Policies and Synchronization Methods on the Performance of Parallel Applications”, Proceedings of the 1991 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, San Diego, CA, pp. 120–132, May, 1991.

    Google Scholar 

  20. M. Holliday, “Reference History, Page Size, and Migration Daemons in Local/Remote Architectures”, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 104–112, April, 1989.

    Google Scholar 

  21. J. Kuskin, D. Ofelt, M. Heinrich, J. Heinlein, R. Simoni, K. Gharachorloo, J. Chapin, D. Nakahira, J. Baxter, M. Horowitz, A. Gupta, M. Rosenblum, and J. Hennessy, “The Stanford FLASH Multiprocessor”, Proceedings of the 21st Annual International Symposium on Computer Architecture, pp. 302–313, April, 1994.

    Google Scholar 

  22. R. LaRowe Jr., C. Ellis, and L. Kaplan, “The Robustness of NUMA Memory Management”, Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles, Pacific Grove, CA, pp. 137–151, October, 1991.

    Google Scholar 

  23. D. Lenoski, J. Laudon, T. Joe, D. Nakahari, L. Stevens, A. Gupta, and J. Hennessy, “The DASH Prototype: Implementation and Performance”, The Proceedings of the 19th Annual International Symposium on Computer Architecture, pp. 92–103, May, 1992.

    Google Scholar 

  24. S. T. Leutenegger and M. K. Vernon, “The Performance of Multiprogrammed Multiprocessor Scheduling Policies”, Proceedings of the 1990 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, Boulder, CO, pp. 226–236, May, 1990.

    Google Scholar 

  25. T. Lovett and R. Clapp, “STiNG: A CC-NUMA Computer System for the Commercial Marketplace”, Proceedings of the 23rd Annual International Symposium on Computer Architecture, pp. 308–317, May, 1996.

    Google Scholar 

  26. E. P. Markatos, “Scheduling for Locality in Shared-Memory Multiprocessors”, Ph.D. Thesis, Department of Computer Science, University of Rochester, Rochester, New York, May, 1993.

    Google Scholar 

  27. E. P Markatos and T. J. LeBlanc, “Load Balancing vs. Locality Management in Shared-Memory Multiprocessors”, 1992 International Conference on Parallel Processing, pp. 258–267, August, 1992.

    Google Scholar 

  28. E. P Markatos and T J. LeBlanc, “Using Processor Affinity in Loop Scheduling on Shared-Memory Multiprocessors”, Proceedings of Supercomputing '92, Minneapolis, MN, pp. 104–113, November, 1992.

    Google Scholar 

  29. C. McCann, R. Vaswani, and J. Zahorjan, “A Dynamic Processor Allocation Policy for Multiprogrammed, Shared Memory Multiprocessors” ACM Transactions on Computer Systems, Vol. 11, No. 2, pp. 146–178, May, 1993.

    Article  Google Scholar 

  30. C. McCann and J. Zahorjan, “Scheduling Memory Constrained Jobs on Distributed Memory Parallel Computers”, Proceedings of the 1995 ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, Ottawa, ON, pp. 208–219, May, 1995.

    Google Scholar 

  31. A. Nowatzyk, G. Aybay, M. Browne, E. Kelly, M. Parkin, B. Radke, and S. Vishin, “The S3.mp Scalable Shared Memory Multiprocessor”, Proceedings of the International Conference on Parallel Processing, 1995.

    Google Scholar 

  32. J.K.Ousterhout,“Scheduling Techniques for Concurrent Systems”, Proceedings of the 3rd International Conference on Distributed Computing Systems, pp. 22–30, October, 1982.

    Google Scholar 

  33. E. Parsons and K. Sevcik, “Coordinated Allocation of Memory and Processors in Multiprocessors”, Proceedings of the 1996 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, Philadelphia, PA, pp. 57–67, May, 1996.

    Google Scholar 

  34. V. Peris, M. Squillante, and V. Naik, “Analysis of the Impact of Memory in Distributed Parallel Processing Systems”, Proceedings of the 1994 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, Nashville, TN, pp. 5–18, May, 1994.

    Google Scholar 

  35. J. Philbin, J. Edler, O. Anshus, C. Douglas, and K. Li, “Thread Scheduling for Cache Locality”, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, Cambridge, MA, pp. 60–71, October, 1996.

    Google Scholar 

  36. S. Setia,“The Interaction Between Memory Allocations and Adaptive Partitioning in Message-Passing Multiprocessors”, Job Scheduling Strategies for Parallel Processing, ed. D. G. Feitelson and L. Rudolph, Vol. 949, Springer-Verlag, Lecture Notes in Computer Science, pp. 146–164, April, 1995.

    Google Scholar 

  37. A. Silberschatz and P. Galvin, Operating System Concepts, Addison-Wesley, Reading, Massachusetts, 1994.

    Google Scholar 

  38. J. P. Singh, T. Joe, A. Gupta, and J. Hennessy, “An Empirical Comparison of the Kendall Square Research KSR-1 and Stanford Dash Multiprocessors”, Proceedings of Supercomputing '93, Portland, OR, pp. 214–225, November, 1993.

    Google Scholar 

  39. M. S. Squillante, “Issues in Shared-Memory Multiprocessor Scheduling: A Performance Evaluation”, Ph.D. Thesis, Department of Computer Science and Engineering, University of Washington, Seattle, Washington, Technical Report 90-10-04, October, 1990.

    Google Scholar 

  40. M. S. Squillante and E. D. Lazowska, “Using Processor Cache Affinity Information in Shared-Memory Multiprocessor Scheduling”, IEEE Transactions on Parallel and Distributed Systems, Vol. 4, No. 2, pp. 131–143, February, 1993.

    Article  Google Scholar 

  41. M. Stumm, Z. Vranesic, R. White, R. Unrau, and K. Farkas, “Experiences with the Hector Multiprocessor”, Proceedings of the International Parallel Processing Symposium Parallel Processing Fair, pp. 9–16, April, 1993.

    Google Scholar 

  42. D. Thiebaut and H. S. Stone, “Footprints in the Cache”, ACM Transactions on Computer Systems, Vol. 5, No. 4, pp. 305–329, November, 1987.

    Article  Google Scholar 

  43. A. Tucker and A. Gupta,“Process Control and Scheduling Issues for Multiprogrammed Shared-Memory Multiprocessors”, Proceedings of the Twelfth ACM Symposium on Operating Systems Principles, pp. 159–166, December, 1989.

    Google Scholar 

  44. R. Unrau, “Scalable Memory Management through Hierarchical Symmetric Multiprocessing”, Ph.D. Thesis, University of Toronto, Toronto, Ontario, January, 1993.

    Google Scholar 

  45. R. Unrau, M. Stumm, and O. Krieger, “Hierarchical Clustering: A Structure for Scalable Multiprocessor Operating System Design”, Proceedings of the USENIX Workshop on Micro-Kernels and Other Kernel Architectures, Seattle, WA, pp. 285–303, April, 1992.

    Google Scholar 

  46. R. Vaswani and J. Zahorjan, “The Implications of Cache Affinity on Processor Scheduling for Multiprogrammed, Shared Memory Multiprocessors”, Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles, Pacific Grove, CA, pp. 26–40, October, 1991.

    Google Scholar 

  47. B. Verghese, S. Devine, A. Gupta, and M. Rosenblum, “Operating System Support for Improving Data Locality on CC-NUMA Compute Servers”, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, Cambridge, MA, pp. 279–289, October, 1996.

    Google Scholar 

  48. Z. Vranesic, M. Stumm, D. Lewis, and R. White, “Hector: A Hierarchically Structured Shared-Memory Multiprocessor”, IEEE Computer, Vol. 24, No. 1, pp. 72–79, January, 1991.

    Google Scholar 

  49. S. Woo, M. Ohara, E. Torrie, J.P. Singh, and A. Gupta, “The SPLASH-2 Programs: Characterization and Methodological Considerations”, Proceedings of the 22nd Annual International Symposium on Computer Architecture, pp. 24–36, 1995.

    Google Scholar 

  50. J. Zahorjan and C. McCann, “Processor Scheduling in Shared Memory Multiprocessors”, Proceedings of the 1990 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, Boulder, CO, pp. 214–225, May, 1990.

    Google Scholar 

  51. S. Zhou and T. Brecht, “Processor Pool-Based Scheduling for Large-Scale NUMA Multiprocessors”, Proceedings of the 1991 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, San Diego, CA, pp. 133–142, May, 1991.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Dror G. Feitelson Larry Rudolph

Rights and permissions

Reprints and permissions

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brecht, T.B. (1997). An experimental evaluation of processor pool-based scheduling for shared-memory NUMA multiprocessors. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1997. Lecture Notes in Computer Science, vol 1291. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63574-2_20

Download citation

  • DOI: https://doi.org/10.1007/3-540-63574-2_20

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63574-1

  • Online ISBN: 978-3-540-69599-8

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics