Abstract
We study the complexity of traversing tree-shaped workflows whose tasks require large I/O files. We target a heterogeneous architecture with two resources of different types, each equipped with its own memory, such as a multicore node equipped with a dedicated accelerator (FPGA or GPU). Tasks in the workflow are tagged with the type of resource that is needed for their processing. Besides, a task can be processed on a given resource only if all its input files and output files can be stored in the corresponding memory. At a given execution step, the amount of data stored in each memory strongly depends upon the ordering in which the tasks are executed, and upon when communications between both memories are scheduled. The objective is to determine an efficient traversal that minimizes the maximum amount of memory of each type needed to traverse the whole tree. In this paper, we establish the complexity of this two-memory scheduling problem, provide inapproximability results, and show how to determine the optimal depth-first traversal. Altogether, these results lay the foundations for memory-aware scheduling algorithms on heterogeneous platforms.
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Augonnet, C., Thibault, S., Namyst, R., Wacrenier, P.-A.: Starpu: A unified platform for task scheduling on heterogeneous multicore architectures. Concurrency and Computation: Practice and Experience 23(2), 187–198 (2011)
Herrmann, J., Marchal, L., Robert, Y.: Tree traversals with task-memory affinities. Research report 8226, INRIA (2013)
Horton, M., Tomov, S., Dongarra, J.: A class of hybrid lapack algorithms for multicore and gpu architectures. In: 2011 Symposium on Application Accelerators in High-Performance Computing (SAAHPC), pp. 150–158 (July 2011)
Jacquelin, M., Marchal, L., Robert, Y., Ucar, B.: On optimal tree traversals for sparse matrix factorization. In: IPDPS 2011 (2011)
Liu, J.W.H.: On the storage requirement in the out-of-core multifrontal method for sparse factorization. ACM Trans. Math. Software 12(3), 249–264 (1986)
Liu, J.W.H.: An application of generalized tree pebbling to sparse matrix factorization. SIAM J. Algebraic Discrete Methods 8(3) (1987)
Marchal, L., Sinnen, O., Vivien, F.: Scheduling tree-shaped task graphs to minimize memory and makespan. Research report 8082, INRIA (2012); Accepted for publication in IPDPS 2013
Ramakrishnan, A., Singh, G., Zhao, H., Deelman, E., Sakellariou, R., Vahi, K., Blackburn, K., Meyers, D., Samidi, M.: Scheduling data-intensiveworkflows onto storage-constrained distributed resources. In: CCGRID 2007. IEEE (2007)
Sethi, R.: Complete register allocation problems. In: STOC 1973, pp. 182–195. ACM Press (1973)
Sethi, R., Ullman, J.: The generation of optimal code for arithmetic expressions. J. ACM 17(4), 715–728 (1970)
Tomov, S., Nath, R., Du, P., Dongarra, J.: MAGMA version User’s guide (2009), http://icl.eecs.utk.edu/magma/
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Herrmann, J., Marchal, L., Robert, Y. (2013). Model and Complexity Results for Tree Traversals on Hybrid Platforms. In: Wolf, F., Mohr, B., an Mey, D. (eds) Euro-Par 2013 Parallel Processing. Euro-Par 2013. Lecture Notes in Computer Science, vol 8097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40047-6_65
Download citation
DOI: https://doi.org/10.1007/978-3-642-40047-6_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40046-9
Online ISBN: 978-3-642-40047-6
eBook Packages: Computer ScienceComputer Science (R0)