A Hierarchical CLH Queue Lock

Luchangco, Victor; Nussbaum, Dan; Shavit, Nir

doi:10.1007/11823285_84

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4128)

Included in the following conference series:

European Conference on Parallel Processing

Abstract

Modern multiprocessor architectures such as CC-NUMA machines or CMPs have nonuniform communication architectures that render programs sensitive to memory access locality. A recent paper by Radović and Hagersten shows that performance gains can be obtained by developing general-purpose mutual-exclusion locks that encourage threads with high mutual memory locality to acquire the lock consecutively, thus reducing the overall cost due to cache misses. Radović and Hagersten present the first such hierarchical locks. Unfortunately, their locks are backoff locks, which are known to incur higher cache miss rates than queue-based locks, suffer from various fundamental fairness issues, and are hard to tune so as to maximize locality of lock accesses.

Extending queue-locking algorithms to be hierarchical requires that requests from threads with high mutual memory locality be consecutive in the queue. Until now, it was not clear that one could design such locks because collecting requests locally and moving them into a global queue seemingly requires a level of coordination whose cost would defeat the very purpose of hierarchical locking.

This paper presents a hierarchical version of the Craig, Landin, and Hagersten CLH queue lock, which we call the HCLH queue lock. In this algorithm, threads build implicit local queues of waiting threads, splicing them into a global queue at the cost of only a single CAS operation.
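For orientation, the following is a minimal sketch of the base (non-hierarchical) CLH queue lock that the HCLH lock extends. It is not the HCLH algorithm itself: it omits the per-cluster local queues and the CAS-based splice described above. The class and method names (`CLHLockSketch`, `QNode`, `run`) are hypothetical, and the spin loop yields the processor only to keep the demo well behaved on machines with few cores.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical names throughout; a sketch of the classic CLH lock, not of HCLH.
class CLHLockSketch {
    /** One node per pending request; locked is true while its owner waits for or holds the lock. */
    static final class QNode {
        volatile boolean locked;
    }

    static final class CLHLock {
        // Tail of the implicit queue; the initial dummy node is already released.
        private final AtomicReference<QNode> tail = new AtomicReference<>(new QNode());
        private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);
        private final ThreadLocal<QNode> myPred = new ThreadLocal<>();

        void lock() {
            QNode node = myNode.get();
            node.locked = true;
            // A single atomic swap enqueues this request behind the previous tail.
            QNode pred = tail.getAndSet(node);
            myPred.set(pred);
            // Spin on the predecessor's node only, so each waiter watches its own location.
            while (pred.locked) {
                Thread.yield();
            }
        }

        void unlock() {
            QNode node = myNode.get();
            node.locked = false;      // hands the lock to the successor, if any
            myNode.set(myPred.get()); // reuse the predecessor's node for the next request
        }
    }

    /** Increments a shared counter under the lock from several threads and returns the total. */
    static int run(int threads, int iters) {
        CLHLock lock = new CLHLock();
        int[] counter = new int[1];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iters; i++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(run(4, 500));
    }
}
```

In this base lock every thread joins one global queue in arrival order, regardless of where it runs. The hierarchical variant described above differs in that threads first enqueue locally, and a whole local queue is then spliced into the global queue in one step.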

In a set of microbenchmarks run on a large-scale multiprocessor machine and a state-of-the-art multi-threaded multi-core chip, the HCLH algorithm exhibits better performance and significantly better fairness than the hierarchical backoff locks of Radović and Hagersten.

References

Anderson, T.: The performance implications of spin lock alternatives for shared-memory multiprocessors. IEEE Trans. Parallel and Distributed Systems 1(1), 6–16 (1990)
Craig, T.: Building FIFO and priority-queueing spin locks from atomic swap. Technical Report TR 93-02-02, University of Washington, Dept of Computer Science (1993)
Mellor-Crummey, J., Scott, M.: Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. Computer Systems 9(1), 21–65 (1991)
Magnusson, P., Landin, A., Hagersten, E.: Queue locks on cache coherent multiprocessors. In: Proc. 8th International Symposium on Parallel Processing (IPPS), pp. 165–171 (1994)
Agarwal, A., Cherian, M.: Adaptive backoff synchronization techniques. In: Proc. 16th International Symposium on Computer Architecture, pp. 396–406 (1989)
Radović, Z., Hagersten, E.: Hierarchical Backoff Locks for Nonuniform Communication Architectures. In: HPCA-9, Anaheim, California, USA, pp. 241–252 (2003)
Sun Microsystems: Sun Fire E25K/E20K Systems Overview. Technical Report 817-4136-12, Sun Microsystems (2005)
Kongetira, P., Aingaran, K., Olukotun, K.: Niagara: A 32-way multithreaded sparc processor. IEEE Micro 25(2), 21–29 (2005)
Scott, M., Scherer, W.: Scalable queue-based spin locks with timeout. In: Proc. 8th ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming, pp. 44–52 (2001)

Author information

Authors and Affiliations

Sun Microsystems Laboratories,
Victor Luchangco, Dan Nussbaum & Nir Shavit

Editor information

Editors and Affiliations

ZIH, TU Dresden, Germany
Wolfgang E. Nagel
Fakultät Mathematik, Institut für wissenschaftliches Rechnen, TU Dresden, 01062, Dresden, Germany
Wolfgang V. Walter
Database Technology Group, Technische Universität Dresden, Germany
Wolfgang Lehner

About this paper

Cite this paper

Luchangco, V., Nussbaum, D., Shavit, N. (2006). A Hierarchical CLH Queue Lock. In: Nagel, W.E., Walter, W.V., Lehner, W. (eds) Euro-Par 2006 Parallel Processing. Euro-Par 2006. Lecture Notes in Computer Science, vol 4128. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823285_84

DOI: https://doi.org/10.1007/11823285_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37783-2
Online ISBN: 978-3-540-37784-9
eBook Packages: Computer Science, Computer Science (R0)