Abstract
Heterogeneous accelerated processing units (APUs) integrate a multi-core CPU and a GPU on the same chip. Modern APUs implement CPU–GPU platform atomics for simple data types, but ensuring atomicity for complex data types is left to programmers. Transactional memory (TM) is an optimistic approach to achieving this goal. With TM, multiple threads can access shared data speculatively, but a transaction's changes become visible only if it finishes without its memory accesses conflicting with those of other transactions. In this paper we present APUTM, a software TM designed for APU processors that focuses on minimizing accesses to shared metadata. The main goal of APUTM is to understand the trade-offs of implementing a software TM on such a platform. In our experiments, APUTM outperforms sequential execution of the applications. Additionally, we evaluate how well it adapts to executing on either device alone or on both simultaneously.
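To make the optimistic model concrete, the following is a minimal CPU-only sketch of a software TM in which writes are buffered and published only if no other transaction has committed in the meantime. It is not APUTM's design, which the abstract does not detail; every name here (`Tx`, `Conflict`, `g_seq`, `atomically`, `interleaved_demo`) is hypothetical, and the single global sequence counter is an assumption chosen for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <unordered_map>

// Hypothetical names throughout; an illustration, not APUTM's API.
struct Conflict {};  // thrown when a transaction must abort

// Single piece of shared metadata: even = idle, odd = a commit is in progress.
std::atomic<unsigned> g_seq{0};

class Tx {
public:
    void begin() {
        writes_.clear();
        // Wait until no commit is in flight, then remember the version seen.
        do { snapshot_ = g_seq.load(); } while (snapshot_ & 1u);
    }
    int read(std::atomic<int>& var) {
        auto it = writes_.find(&var);
        if (it != writes_.end()) return it->second;  // read own buffered write
        int value = var.load();
        // If anyone committed since begin(), the value may be inconsistent.
        if (g_seq.load() != snapshot_) throw Conflict{};
        return value;
    }
    void write(std::atomic<int>& var, int value) {
        writes_[&var] = value;  // speculative: invisible to other threads
    }
    void commit() {
        if (writes_.empty()) return;  // read-only transactions need no lock
        unsigned expected = snapshot_;
        // Succeeds only if no other transaction committed since begin().
        if (!g_seq.compare_exchange_strong(expected, snapshot_ + 1))
            throw Conflict{};
        for (auto& entry : writes_) entry.first->store(entry.second);
        g_seq.store(snapshot_ + 2);  // publish and release
    }
private:
    unsigned snapshot_ = 0;
    std::unordered_map<std::atomic<int>*, int> writes_;
};

// Runs body as a transaction, retrying from scratch after each conflict.
template <typename Body>
void atomically(Body body) {
    Tx tx;
    for (;;) {
        try { tx.begin(); body(tx); tx.commit(); return; }
        catch (const Conflict&) { /* discard buffered writes and retry */ }
    }
}

// Hand-interleaves two transactions on one thread to force a conflict.
int interleaved_demo(bool& first_attempt_aborted) {
    std::atomic<int> counter{10};
    Tx t1, t2;
    t1.begin();
    int seen = t1.read(counter);              // t1 speculatively reads 10
    assert(seen == 10);
    t2.begin();
    t2.write(counter, t2.read(counter) + 5);
    t2.commit();                              // t2 commits first
    t1.write(counter, seen + 1);              // based on a stale read
    first_attempt_aborted = false;
    try { t1.commit(); }
    catch (const Conflict&) { first_attempt_aborted = true; }
    // Re-run t1's increment; this attempt observes t2's update.
    atomically([&](Tx& tx) { tx.write(counter, tx.read(counter) + 1); });
    return counter.load();
}
```

Calling `interleaved_demo` shows the first attempt aborting because its read was stale, while the retried increment is applied on top of the other transaction's update. In this sketch every read and every commit touches the one shared counter; that kind of shared-metadata access is what the abstract says APUTM aims to minimize.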
Additional information
This work has been supported by projects TIN2013-42253-P and TIN2016-80920-R from the Spanish Government, and by projects P11-TIC8144 and P12-TIC1470 from the Junta de Andalucía.
About this article
Cite this article
Villegas, A., Navarro, A., Asenjo, R. et al. Toward a software transactional memory for heterogeneous CPU–GPU processors. J Supercomput 75, 4177–4192 (2019). https://doi.org/10.1007/s11227-018-2347-0
DOI: https://doi.org/10.1007/s11227-018-2347-0