Warps and Atomics: Beyond Barrier Synchronization in the Verification of GPU Kernels | SpringerLink
Skip to main content

Warps and Atomics: Beyond Barrier Synchronization in the Verification of GPU Kernels

  • Conference paper
NASA Formal Methods (NFM 2014)

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 8430))

Included in the following conference series:

  • 1082 Accesses

Abstract

We describe the design and implementation of methods to support reasoning about data races in GPU kernels where constructs other than the standard barrier primitive are used for synchronization. At one extreme we consider kernels that exploit implicit, coarse-grained synchronization between threads in the same warp, a feature provided by many architectures. At the other extreme we consider kernels that reduce or avoid barrier synchronization through the use of atomic operations. We discuss design decisions associated with providing support for warps and atomics in GPUVerify, a formal verification tool for OpenCL and CUDA kernels. We evaluate the practical impact of these design decisions using a large set of benchmarks, showing that warps can be supported in a scalable manner, that a coarse abstraction suffices for efficient reasoning about most practical uses of atomic operations, and that a novel, refined abstraction captures an important design pattern where atomic operations are used to compute unique array indices. Our evaluation revealed two previously unknown bugs in publicly available benchmark suites.

This work was supported by the EU FP7 STREP project CARP (project number 287767), and EPSRC project EP/K011499/1, and the Imperial College London UROP scheme.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 5719
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 7149
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. AMD, Inc.: AMD graphics cores next (GCN) architecture, white paper (2012)

    Google Scholar 

  2. Barnett, M., Chang, B.-Y.E., DeLine, R., Jacobs, B., Leino, K.R.M.: Boogie: A modular reusable verifier for object-oriented programs. In: de Boer, F.S., Bonsangue, M.M., Graf, S., de Roever, W.-P. (eds.) FMCO 2005. LNCS, vol. 4111, pp. 364–387. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  3. Betts, A., Chong, N., Donaldson, A.F., Qadeer, S., Thomson, P.: GPUVerify: a verifier for GPU kernels. In: OOPSLA (2012)

    Google Scholar 

  4. Chiang, W.-F., Gopalakrishnan, G., Li, G., Rakamarić, Z.: Formal analysis of GPU programs with atomics via conflict-directed delay-bounding. In: Brat, G., Rungta, N., Venet, A. (eds.) NFM 2013. LNCS, vol. 7871, pp. 213–228. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  5. Collingbourne, P., Cadar, C., Kelly, P.H.J.: Symbolic testing of OpenCL code. In: Eder, K., Lourenço, J., Shehory, O. (eds.) HVC 2011. LNCS, vol. 7261, pp. 203–218. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  6. Collingbourne, P.C.: Symbolic Crosschecking of Data-Parallel Floating Point Code. Ph.D. thesis, Imperial College London (2012)

    Google Scholar 

  7. Emmi, M., Qadeer, S., Rakamaric, Z.: Delay-bounded scheduling. In: POPL (2011)

    Google Scholar 

  8. Huisman, M., Mihelčić, M.: Specification and verification of GPGPU programs using permission-based separation logic. In: BYTECODE (2013)

    Google Scholar 

  9. Khronos Group: The OpenCL extension specification, version 2.0 (2013)

    Google Scholar 

  10. Khronos Group: The OpenCL specification, version 2.0 (2013)

    Google Scholar 

  11. Leino, K., Rustan, M.: This is Boogie 2 (2008), manuscript KRML 178 (2008)

    Google Scholar 

  12. Leung, A., Gupta, M., Agarwal, Y., Gupta, R., Jhala, R., Lerner, S.: Verifying GPU kernels by test amplification. In: PLDI (2012)

    Google Scholar 

  13. Li, G., Gopalakrishnan, G.: Scalable SMT-based verification of GPU kernel functions. In: FSE (2010)

    Google Scholar 

  14. Li, G., Li, P., Sawaya, G., Gopalakrishnan, G., Ghosh, I., Rajan, S.P.: GKLEE: concolic verification and test generation for GPUs. In: PPoPP. ACM (2012)

    Google Scholar 

  15. Li, P., Li, G., Gopalakrishnan, G.: Parametric flows: automated behavior equivalencing for symbolic analysis of races in CUDA programs. In: SC (2012)

    Google Scholar 

  16. NVIDIA Corporation: CUDA C programming guide, version 5.5 (2013)

    Google Scholar 

  17. Sengupta, S., Harris, M., Garland, M.: Efficient parallel scan algorithms for GPUs. Tech. Rep. NVR-2008-003, NVIDIA (2008)

    Google Scholar 

  18. Stratton, J.A., et al.: Parboil: A revised benchmark suite for scientific and commercial throughput computing. Tech. Rep. IMPACT-12-01, UIUC (2012)

    Google Scholar 

  19. Collingbourne, P., Donaldson, A.F., Ketema, J., Qadeer, S.: Interleaving and lock-step semantics for analysis and verification of GPU kernels. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013. LNCS, vol. 7792, pp. 270–289. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Bardsley, E., Donaldson, A.F. (2014). Warps and Atomics: Beyond Barrier Synchronization in the Verification of GPU Kernels. In: Badger, J.M., Rozier, K.Y. (eds) NASA Formal Methods. NFM 2014. Lecture Notes in Computer Science, vol 8430. Springer, Cham. https://doi.org/10.1007/978-3-319-06200-6_18

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-06200-6_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06199-3

  • Online ISBN: 978-3-319-06200-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics