A Cluster Computer Performance Predictor for Memory Scheduling | SpringerLink
Skip to main content

A Cluster Computer Performance Predictor for Memory Scheduling

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2011)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 7017))

  • 1249 Accesses

Abstract

Remote Memory Access (RMA) hardware allow a given motherboard in a cluster to directly access the memory installed in a remote motherboard of the same cluster. In recent works, this characteristic has been used to extend the addressable memory space of selected motherboards, which enable a better balance of main memory resources among cluster applications. This way is much more cost-effective than than implementing a full-fledged shared memory system.

In this context, the memory scheduler is in charge of finding a suitable distribution of local and remote memory that maximizes the performance and guarantees a minimum QoS among the applications. Note that since changing the memory distribution is a slow process involving several motherboards, the memory scheduler needs to make sure that the target distribution provides better performance than the current one.

In this paper, a performance predictor is designed in order to find the best memory distribution for a given set of applications executing in a cluster motherboard. The predictor uses simple hardware counters to estimate the expected impact on performance of the different memory distributions. The hardware counters provide the predictor with the information about the time spent in processor, memory access and network.

The performance model used by the predictor has been validated in a detailed microarchitectural simulator using real benchmarks. Results show that the prediction accuracy never deviates more than 5% compared to the real results, being less than 0.5% in most of the cases.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 5719
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 7149
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Meuer, H.W.: The top500 project: Looking back over 15 years of supercomputing experience. Informatik-Spektrum 31, 203–222 (2008), doi:10.1007/s00287-008-0240-6

    Article  Google Scholar 

  2. Nussle, M., Scherer, M., Bruning, U.: A Resource Optimized Remote-Memory-Access Architecture for Low-latency Communication. In: International Conference on Parallel Processing, pp. 220–227 (September 2009)

    Google Scholar 

  3. Blocksome, M., Archer, C., Inglett, T., McCarthy, P., Mundy, M., Ratterman, J., Sidelnik, A., Smith, B., Almási, G., Castaños, J., Lieber, D., Moreira, J., Krishnamoorthy, S., Tipparaju, V., Nieplocha, J.: Design and implementation of a one-sided communication interface for the IBM eServer Blue Gene®supercomputer. In: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, p. 120. ACM, New York (2006)

    Google Scholar 

  4. Kumar, S., Dózsa, G., Almasi, G., Heidelberger, P., Chen, D., Giampapa, M., Blocksome, M., Faraj, A., Parker, J., Ratterman, J., Smith, B.E., Archer, C.: The deep computing messaging framework: generalized scalable message passing on the blue gene/P supercomputer. In: ICS, pp. 94–103 (2008)

    Google Scholar 

  5. Tipparaju, V., Kot, A., Nieplocha, J., Bruggencate, M.T., Chrisochoides, N.: Evaluation of Remote Memory Access Communication on the Cray XT3. In: IEEE International Parallel and Distributed Processing Symposium, pp. 1–7 (March 2007)

    Google Scholar 

  6. HyperTransport Technology Consortium. HyperTransport I/O Link Specification Revision (October 3, 2008)

    Google Scholar 

  7. Serrano, M., Sahuquillo, J., Hassan, H., Petit, S., Duato, J.: A scheduling heuristic to handle local and remote memory in cluster computers. In: High Performance Computing and Communications (2010) (accepted for publication)

    Google Scholar 

  8. Serrano, M., Sahuquillo, J., Petit, S., Hassan, H., Duato, J.: A cost-effective heuristic to schedule local and remote memory in cluster computers. The Journal of Supercomputing, 1–19 (2011), doi:10.1007/s11227-011-0566-8

    Google Scholar 

  9. Ubal, R., Sahuquillo, J., Petit, S., López, P.: Multi2Sim: A Simulation Framework to Evaluate Multicore-Multithreaded Processors. In: Proceedings of the 19th International Symposium on Computer Architecture and High Performance Computing (2007)

    Google Scholar 

  10. Keltcher, C.N., McGrath, K.J., Ahmed, A., Conway, P.: The AMD Opteron Processor for Multiprocessor Servers. IEEE Micro 23(2), 66–76 (2003)

    Article  Google Scholar 

  11. Duato, J., Silla, F., Yalamanchili, S.: Extending HyperTransport Protocol for Improved Scalability. In: First International Workshop on HyperTransport Research and Applications (2009)

    Google Scholar 

  12. Litz, H., Fröening, H., Nuessle, M., Brüening, U.: A HyperTransport Network Interface Controller for Ultra-low Latency Message Transfers. In: HyperTransport Consortium White Paper (2007)

    Google Scholar 

  13. Zhuravlev, S., Blagodurov, S., Fedorova, A.: Addressing shared resource contention in multicore processors via scheduling. In: Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 129–142 (2010)

    Google Scholar 

  14. Xie, Y., Loh, G.H.: Dynamic Classification of Program Memory Behaviors in CMPs. In: 2nd Workshop on Chip Multiprocessor Memory Systems and Interconnects in conjunction with the 35th International Symposium on Computer Architecture (2008)

    Google Scholar 

  15. Xu, C., Chen, X., Dick, R.P., Mao, Z.M.: Cache contention and application performance prediction for multi-core systems. In: IEEE International Symposium on Performance Analysis of Systems and Software, pp. 76–86 (2010)

    Google Scholar 

  16. Rai, J.K., Negi, A., Wankar, R., Nayak, K.D.: Performance prediction on multi-core processors. In: 2010 International Conference on Computational Intelligence and Communication Networks (CICN), pp. 633–637 (November 2010)

    Google Scholar 

  17. Liang, S., Noronha, R., Panda, D.K.: Swapping to Remote Memory over InfiniBand: An Approach using a High Performance Network Block Device. In: CLUSTER, pp. 1–10. IEEE, Los Alamitos (2005)

    Google Scholar 

  18. Werstein, P., Jia, X., Huang, Z.: A Remote Memory Swapping System for Cluster Computers. In: Eighth International Conference on Parallel and Distributed Computing, Applications and Technologies, pp. 75–81 (2007)

    Google Scholar 

  19. Midorikawa, H., Kurokawa, M., Himeno, R., Sato, M.: DLM: A distributed Large Memory System using remote memory swapping over cluster nodes. In: IEEE International Conference on Cluster Computing, pp. 268–273 (October 2008)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Serrano, M., Sahuquillo, J., Hassan, H., Petit, S., Duato, J. (2011). A Cluster Computer Performance Predictor for Memory Scheduling. In: Xiang, Y., Cuzzocrea, A., Hobbs, M., Zhou, W. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2011. Lecture Notes in Computer Science, vol 7017. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24669-2_34

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-24669-2_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24668-5

  • Online ISBN: 978-3-642-24669-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics