Hybrid work stealing of locality-flexible and cancelable tasks for the APGAS library | The Journal of Supercomputing Skip to main content
Log in

Hybrid work stealing of locality-flexible and cancelable tasks for the APGAS library

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

Since large parallel machines are typically clusters of multicore nodes, parallel programs should be able to deal with both shared memory and distributed memory. This paper proposes a hybrid work stealing scheme, which combines the lifeline-based variant of distributed task pools with the node-internal load balancing of Java’s Fork/Join framework. We implemented our scheme by extending the APGAS library for Java, which is a branch of the X10 project. APGAS programmers can now spawn locality-flexible tasks with a new asyncAny construct. These tasks are transparently mapped to any resource in the overall system, so that the load is balanced over both nodes and cores. Unprocessed asyncAny-tasks can also be cancelled. In performance measurements with up to 144 workers on up to 12 nodes, we observed near linear speedups for four benchmarks and a low overhead for cancellation-related bookkeeping.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3

Similar content being viewed by others

References

  1. Applegate DL, Bixby RE, Chvatal V, Cook WJ (2007) The traveling salesman problem. Princeton University Press, Princeton

    MATH  Google Scholar 

  2. Diaz J, Munoz-Caro C, Nino A (2012) A survey of parallel programming models and tools in the multi and many-core era. IEEE Trans Parallel Distrib Syst 23:1369–1386. https://doi.org/10.1109/tpds.2011.308

    Article  Google Scholar 

  3. Freeman LC (1977) A set of measures of centrality based on betweenness. Sociometry 40(1):35. https://doi.org/10.2307/3033543

    Article  Google Scholar 

  4. Gendron B, Crainic TG (1994) Parallel branch-and-branch algorithms: survey and synthesis. Oper Res 42(6):1042–1066. https://doi.org/10.1287/opre.42.6.1042

    Article  MathSciNet  MATH  Google Scholar 

  5. Gik J (1987) Schach und Mathematik. Deutsch Harri GmbH, Frankfurt a. M

  6. Guo Y, Zhao J, Cave V, Sarkar V (2010) SLAW: a scalable locality-aware adaptive work-stealing scheduler for multi-core systems. In: Proceedings of ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. https://doi.org/10.1145/1693453.1693504

  7. IBM: Core implementation of X10 programming language including compiler, runtime, class libraries, sample programs and test suite. https://github.com/x10-lang/x10 (2017)

  8. IBM: The APGAS library for fault-tolerant distributed programming in Java 8. https://github.com/x10-lang/apgas (2017)

  9. Kestor G, Krishnamoorthy S, Ma W (2017) Localized fault recovery for nested fork-join programs. In: Proceedings of IEEE International Symposium on Parallel and Distributed Processing. https://doi.org/10.1109/ipdps.2017.75

  10. Kolesnichenko A, Nanz S, Meyer B (2013) How to cancel a task. Springer, Berlin, pp 61–72. https://doi.org/10.1007/978-3-642-39955-8_6

    Google Scholar 

  11. Kumar V, Murthy K, Sarkar V, Zheng Y (2016) Optimized distributed work-stealing. In: Proceedings of Workshop on Irregular Applications: Architectures and Algorithms, pp 74–77. https://doi.org/10.1109/IA3.2016.19

  12. Olivier S, Huan J, Liu J, Prins J, Dinan J, Sadayappan P, Tseng CW (2006) UTS: an unbalanced tree search benchmark. In: Languages and Compilers for Parallel Computing, pp 235–250. Springer LNCS 4382. https://doi.org/10.1007/978-3-540-72521-3_18

  13. OpenMP ARB: OpenMP specifications. http://www.openmp.org/specifications/ (2017)

  14. Oracle: Class ForkJoinPool. https://docs.oracle.com/javase/9/docs/api/java/util/concurrent/ForkJoinPool.html (2017)

  15. Paudel J, Tardieu O, Amaral JN (2013) Hybrid parallel task placement in X10. In: Proceedings of ACM SIGPLAN Workshop on X10. https://doi.org/10.1145/2481268.2481277

  16. Paudel J, Tardieu O, Amaral JN (2013) On the merits of distributed work-stealing on selective locality-aware tasks. In: Proceedings of International Conference on Parallel Processing. https://doi.org/10.1109/icpp.2013.19

  17. Posner J, Fohry C (2016) Cooperation versus coordination for lifeline-based global load balancing in APGAS. In: Proceedings of ACM SIGPLAN Workshop on X10. https://doi.org/10.1145/2931028.2931029

  18. Rice University: HabaneroUPC++: a Compiler-free PGAS Library. https://github.com/habanero-rice/habanero-upc (2017)

  19. STEllAR-GROUP: HPX: The C++ standards library for parallelism and concurrency (2017). https://github.com/STEllAR-GROUP/hpx

  20. Tardieu O (2015) The APGAS library: resilient parallel and distributed programming in Java 8. In: Proceedings of ACM SIGPLAN Workshop on X10. https://doi.org/10.1145/2771774.2771780

  21. Thoman P, et al (2018) A taxonomy of task-based technologies for high-performance computing. In: Proceedings of International Conference Parallel Processing and Applied Mathematics (To appear)

  22. University of Kassel: Scientific data processing. https://www.uni-kassel.de/its-handbuch/en/daten-dienste/wissenschaftliche-datenverarbeitung.html (2017)

  23. Yamashita K, Kamada T (2016) Introducing a multithread and multistage mechanism for the global load balancing library of X10. J Inf Process 24(2):416–424. https://doi.org/10.2197/ipsjjip.24.416

    Google Scholar 

  24. Zhang W, Tardieu O, Grove D, Herta B, Kamada T, Saraswat V, Takeuchi M (2014) GLB lifeline-based global load balancing library in X10. In: Proceedings of ACM Workshop on Parallel Programming for Analytics Applications. https://doi.org/10.1145/2567634.2567639

Download references

Acknowledgements

This work is supported by the Deutsche Forschungsgemeinschaft, under Grant FO 1035/5-1.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Jonas Posner.

Additional information

This paper is an extended version of Jonas Posner and Claudia Fohry: A Combination of Intra- and Inter-Place Work Stealing for the APGAS Library. Parallel Processing and Applied Mathematics Workshops (WLPP), 2017.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Posner, J., Fohry, C. Hybrid work stealing of locality-flexible and cancelable tasks for the APGAS library. J Supercomput 74, 1435–1448 (2018). https://doi.org/10.1007/s11227-018-2234-8

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11227-018-2234-8

Keywords

Navigation